Re: FW: [HACKERS] Allow replacement of bloated primary key indexes without foreign key rebuilds

2012-07-12 Thread Amit Kapila
 

> From: Gurjeet Singh [mailto:singh.gurj...@gmail.com]
> Sent: Friday, July 13, 2012 4:24 AM
> On Tue, Jul 10, 2012 at 2:40 AM, Amit Kapila wrote:
>
>>> Having to drop foreign key constraints before this command, and recreate
>>> them afterwards makes this command useless to most database setups. I feel
>>> sorry that no one brought this up when we were implementing the feature;
>>> maybe we could've done something about it right then.
>
>> Will it impact the user such that it will block its operation or something
>> similar, or is it a usability issue?
>
> Yes, it will have to take an exclusive lock on the index, and possibly the
> table too, but the operation should be too quick to even be noticeable in
> low-load conditions.

Which index are you referring to here? Is it the primary key index on the
table?

According to what I have debugged, the locks are taken on the foreign key
table, the constraint object, and the dependent triggers.

> However, if the x-lock is waiting for some other long-running query to
> finish, then the lock-queuing logic in Postgres will make new queries wait
> for this x-lock to be taken and released before any new query can begin
> processing. This is my recollection of the logic from an old conversation;
> others can weigh in to confirm.

>>> Syntax options:
>>>
>>> ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE} USING
>>> INDEX new_index;
>>>
>>> ALTER INDEX ind REPLACE WITH new_index;
>
>> After this new syntax there will be 2 ways for users to do the replacement
>> of an index; won't it confuse users as to which syntax to use?
>
> Yes, I forgot to mention this in the original post. This feature will be a
> superset of the feature we introduced in ALTER TABLE. I don't see a way
> around that, except for slowly deprecating the older feature.

After the new implementation, there will be no need to perform any operation
on the table with the foreign key, which will reduce the lock time on it as
well.

However, once REINDEX CONCURRENTLY is implemented, this feature will also
need to be deprecated. That might not happen soon, but I still feel it should
be considered whether the new syntax and implementation are really required
by users.

With Regards,

Amit Kapila.



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Amit Kapila

> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org]
> On Behalf Of Jose Ildefonso Camargo Tolosa
> On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk wrote:
>> On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas wrote:

> As it currently is, the point of freezing the master because the standby
> dies is not good for all cases (and I dare say: for most cases), and
> having to wait for pacemaker or other monitoring to note that, change
> the master config, and reload... it will cause a service disruption! (for
> several seconds, usually ~30 seconds).

Yes, it is true that it can cause a service disruption, but the same will be
true even if the master detects that internally via a timeout.
By keeping this external, the current behavior of PostgreSQL can be
maintained: if there is no standby in sync mode, the master will wait, and
the external monitor can still serve the purpose by sending a message to the
master.




Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-12 Thread Jeff Janes
I've moved this thread from performance to hackers.

The topic was poor performance when truncating lots of small tables
repeatedly on test environments with fsync=off.

On Thu, Jul 12, 2012 at 6:00 PM, Jeff Janes  wrote:

> I think the problem is in the Fsync Absorption queue.  Every truncate
> adds a FORGET_RELATION_FSYNC to the queue, and processing each one of
> those leads to sequential scanning the checkpointer's pending ops hash
> table, which is quite large.  It is almost entirely full of other
> requests which have already been canceled, but it still has to dig
> through them all.   So this is essentially an N^2 operation.

My attached Proof of Concept patch reduces the run time of the
benchmark at the end of this message from 650sec to 84sec,
demonstrating that this is in fact the problem.  Which doesn't mean
that my patch is the right answer to it, of course.

(The delete option is still faster than truncate, coming in at around 55sec)


> I'm not sure why we don't just delete the entry instead of marking it
> as cancelled.  It looks like the only problem is that you can't delete
> an entry other than the one just returned by hash_seq_search.  Which
> would be fine, as that is the entry that we would want to delete;
> except that mdsync might have a different hash_seq_search open, and so
> it wouldn't be safe to delete.
>
> If the segno was taken out of the hash key and handled some other way,
> then the forgetting could be done with a simple hash look up rather
> than a full scan.

The above two ideas might be the better solution, as they would work
even when fsync=on.  Since BBUs are becoming so popular, I think the fsync
queue could be a problem even with fsync=on if the individual fsyncs are
fast enough.  But I don't immediately know how to implement them.
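
To make those ideas concrete, here is a rough C sketch of the two shapes
involved (illustrative names and types only, not the actual md.c code):

#include "postgres.h"
#include "storage/block.h"
#include "storage/relfilenode.h"
#include "utils/hsearch.h"

typedef struct
{
    RelFileNode rnode;          /* relation */
    BlockNumber segno;          /* segment number, currently part of the key */
} PendingOperationTag;

typedef struct
{
    PendingOperationTag tag;    /* hash key */
    bool        canceled;       /* canceled, but still visited on every scan */
} PendingOperationEntry;

/*
 * Today, forgetting a relation's fsync requests means visiting every
 * entry: O(N) per truncate, hence the N^2 behavior described above.
 */
static void
forget_relation_by_scan(HTAB *pendingOpsTable, RelFileNode rnode)
{
    HASH_SEQ_STATUS hstat;
    PendingOperationEntry *entry;

    hash_seq_init(&hstat, pendingOpsTable);
    while ((entry = (PendingOperationEntry *) hash_seq_search(&hstat)) != NULL)
    {
        if (RelFileNodeEquals(entry->tag.rnode, rnode))
            entry->canceled = true;     /* can't HASH_REMOVE safely here:
                                         * mdsync may have its own seq scan
                                         * open on this table */
    }
}

/*
 * If segno were moved out of the key and into the entry (say, as a list
 * or bitmap of segments), the forget would be a single keyed removal.
 */
static void
forget_relation_by_lookup(HTAB *pendingOpsTable, RelFileNode rnode)
{
    (void) hash_search(pendingOpsTable, &rnode, HASH_REMOVE, NULL);
}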

> Maybe we could just turn off the pending ops table altogether when
> fsync=off, but since fsync is PGC_SIGHUP it is not clear how you could
> safely turn it back on.

Now that I think about it, I don't see how turning fsync from off to
on can ever be known to be safe, until a system wide sync has
intervened.  After all a segment that was dirtied and added to the
pending ops table while fsync=off might also be removed from the
pending ops table the microsecond before fsync is turned on, so how is
that different from never adding it in the first place?

The attached Proof Of Concept patch implements this in two ways, one
of which is commented out.  The commented out way omits the overhead
of sending the request to the checkpointer in the first place, but
breaks modularity a bit.

The benchmark used on 9.3devel head is:

fsync=off, all other defaults.

## one time initialization
perl -le 'print "create schema foo$_; create table foo$_.foo$_ (k
integer, v integer);" $ARGV[0]..$ARGV[0]+$ARGV[1]-1' 0 10 |psql

## actual benchmark.
perl -le 'print "set client_min_messages=warning;";
foreach (1..1) {
print "BEGIN;\n";
print "insert into foo$_.foo$_ select * from
generate_series(1,10); " foreach $ARGV[0]..$ARGV[0]+$ARGV[1]-1;
print "COMMIT;\nBEGIN;\n";
print "truncate table foo$_.foo$_; " foreach
$ARGV[0]..$ARGV[0]+$ARGV[1]-1;
#print "delete from foo$_.foo$_; " foreach
$ARGV[0]..$ARGV[0]+$ARGV[1]-1;
print "COMMIT;\n"
   }  ' 0 10 | time psql > /dev/null

Cheers,

Jeff


fsync_queue_POC.patch
Description: Binary data



Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Etsuro Fujita
Hi Hanada-san,

> It would be possible to add some more features, such as ORDER BY
> push-down with index information support, without changing existing
> APIs, but at first add relatively simple pgsql_fdw and enhance it seems
> better.  In addition, once pgsql_fdw has been merged, it would help
> implementing proof-of-concept of SQL/MED-related features.

I agree on this point.  However, I think it is preferable that pgsql_fdw
should support, from the start, the push-down of PARAM_EXEC params, and thus
the parameter-change-driven remote-rescanning functionality for them.  I
think such functionality is necessary for pgsql_fdw to efficiently process
SubLinks on remote tables, and to realize parameterized scans in the future,
which I think will be proof-of-concept code to demonstrate to developers how
to enhance FDWs, as discussed in the earlier thread on the IMAP FDW...

Thanks,

Best regards,
Etsuro Fujita





Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Kohei KaiGai
2012/7/13 Shigeru HANADA :
> (2012/07/12 20:48), Kohei KaiGai wrote:
>> It seems to me what postgresql_fdw_validator() is doing looks like
>> a function to be named as "libpq_fdw_validator()".
>>
>> How about your opinion? It will help this namespace conflicts.
>
> I'd prefer dblink_fdw_validator.
>
> The name "libpq_fdw_validator" impresses me that a concrete FDW named
> "libpq_fdw" is somewhere and it retrieves external data *from* libpq.
> Indeed postgresql_fdw_validator allows only some of libpq options at the
> moment, but we won't be able to rename it for backward compatibility
> even if it wants to have non-libpq options in the future.
>
> IMO basically each FDW validator should be owned by a particular FDW,
> because in most cases validator should know FDW's internal deeply.  In
> addition, it would want to have new options for new features.
>
> Besides naming, as mentioned upthread, removing hard-coded libpq options
> list from dblink and leaving it to libpq client library would make
> dblink more robust about libpq option changes in future.
>
OK, it seems fair enough to me.

Does someone have different opinions?

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Tom Lane
Joe Conway  writes:
> Is it still worth backpatching a change to has_language_privilege as a
> bug fix?

I think back-patching a behavioral change in this area is probably a
bad idea.  We can fix it (in one way or another) going forward, but
changing this sort of thing in a minor release seems likely to have
more bad consequences than good ones.

regards, tom lane



Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file foreign tables

2012-07-12 Thread Etsuro Fujita
Thanks!

Best regards,
Etsuro Fujita

> -Original Message-
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Friday, July 13, 2012 5:30 AM
> To: Etsuro Fujita
> Cc: 'Kohei KaiGai'; 'Robert Haas'; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file
> foreign tables
> 
> "Etsuro Fujita"  writes:
> > Besides the revision, I modified check_selective_binary_conversion() to run
> > heap_close() in the whole-row-reference case.  Attached is an updated
> > version of the patch.
> 
> Applied with minor, mostly-cosmetic revisions.  I did fix
> check_selective_binary_conversion to not continue touching the
> relation's tupledesc after heap_close.  Also I thought
> "convert_selectively" was a better name for the hidden COPY option.
> 
>   regards, tom lane





Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 4:10 PM, Shaun Thomas  wrote:
> On 07/12/2012 12:02 PM, Bruce Momjian wrote:
>
>> Well, the problem also exists if we add it as an internal database
>> feature --- how long do we wait to consider the standby dead, how do
>> we inform administrators, etc.
>
>
> True. Though if there is no secondary connected, either because it's not
> there yet, or because it disconnected, that's an easy check. It's the
> network lag/stall detection that's tricky.

Well, yes... but how does PostgreSQL currently note that its "main
synchronous standby" went away and that it has to use another standby as
the synchronous one?  How long does it take to note that?

>
>
>> I don't think anyone says the feature is useless, but it isn't going
>> to be a simple boolean either.
>
>
> Oh $Deity no. I'd never suggest that. I just tend to be overly verbose, and
> sometimes my intent gets lost in the rambling as I try to explain my
> perspective. I apologize if it somehow came across that anyone could just
> flip a switch and have it work.
>
> My C is way too rusty, or I'd be writing an extension right now to do this,
> or be looking over that patch I linked to originally to make suitable
> adaptations. I know I talk about how relatively handy DRBD is, but it's also
> a gigantic PITA since it has to exist underneath the actual filesystem. :)
>
>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
> 312-444-8534
> stho...@optionshouse.com



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:29 PM, Aidan Van Dyk  wrote:
On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa wrote:
>
>> Yeah, you need that with PostgreSQL, but no with DRBD, for example
>> (sorry, but DRBD is one of the flagships of HA things in the Linux
>> world).  Also, I'm not convinced about the "2nd standby" thing... I
>> mean, just read this on the docs, which is a little alarming:
>>
>> "If primary restarts while commits are waiting for acknowledgement,
>> those waiting transactions will be marked fully committed once the
>> primary database recovers. There is no way to be certain that all
>> standbys have received all outstanding WAL data at time of the crash
>> of the primary. Some transactions may not show as committed on the
>> standby, even though they show as committed on the primary. The
>> guarantee we offer is that the application will not receive explicit
>> acknowledgement of the successful commit of a transaction until the
>> WAL data is known to be safely received by the standby."
>>
>> So... there is no *real* warranty here either... I don't know how I
>> skipped that paragraph before today... I mean, this implies that it
>> is possible that a transaction could be marked as committed on the
>> master, but the app was not informed of that (and thus, could try to
>> send it again), and the transaction was NOT applied on the standby...
>> how can this happen? I mean, when the master comes back, shouldn't the
>> standby get the missing WAL pieces from the master and then apply the
>> transaction? The standby part is the one that I don't really get; on
>> the application side... well, there are several ways in which you can
>> miss the "commit confirmation": connection issues in the worst moment,
>> and such, so I guess it is not *so* serious, and the app should
>> have a way of checking its last transaction if it lost connectivity to
>> the server before getting the transaction committed.
>
> But you already have that in a single server situation as well.  There
> is a window between when the commit is "durable" (and visible to
> others, and will be committed after recovery of a crash), when the
> client doesn't yet know it's committed (and might never get the commit
> message due to server crash, network disconnect, client middle-tier
> crash, etc).
>
> So people are already susceptible to that, and defending against it, no? ;-)

Right.  What I'm saying is that this particular part of the docs:

"If primary restarts while commits are waiting for acknowledgement,
those waiting transactions will be marked fully committed once the
primary database recovers." (...) "Some transactions may not show as
committed on the standby, even though they show as committed on the
primary." (...)

See? It sounds like, even after the primary database recovers, the standby
will still not have the transaction committed, whereas as far as I thought
I knew, the standby should get it over the WAL stream from the master
once it reconnects.

>
> And they are susceptible to that if they are on PostgreSQL, Oracle, MS
> SQL, DB2, etc.

Certainly.  That's why I said:

(...)"The standby part is the one that I don't really get, on
the application side... well, there are several ways in which you can
miss the "commit confirmation": connection issues in the worst moment,
and the such, so, I guess it is not *so* serious, and the app should
have a way of checking its last transaction if it lost connectivity to
server before getting the transaction commited."



Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Shigeru HANADA
(2012/07/12 20:48), Kohei KaiGai wrote:
> It seems to me what postgresql_fdw_validator() is doing looks like
> a function to be named as "libpq_fdw_validator()".
> 
> How about your opinion? It will help this namespace conflicts.

I'd prefer dblink_fdw_validator.

The name "libpq_fdw_validator" impresses me that a concrete FDW named
"libpq_fdw" is somewhere and it retrieves external data *from* libpq.
Indeed postgresql_fdw_validator allows only some of libpq options at the
moment, but we won't be able to rename it for backward compatibility
even if it wants to have non-libpq options in the future.

IMO basically each FDW validator should be owned by a particular FDW,
because in most cases validator should know FDW's internal deeply.  In
addition, it would want to have new options for new features.

Besides naming, as mentioned upthread, removing hard-coded libpq options
list from dblink and leaving it to libpq client library would make
dblink more robust about libpq option changes in future.

Regards,
-- 
Shigeru HANADA





Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Joe Conway
On 07/12/2012 02:53 PM, Tom Lane wrote:
> Peter Eisentraut  writes:
>> As long as we're spending time on this, I'd propose getting rid of
>> lanpltrusted, at least for access checking.  Instead, just don't
>> install USAGE privileges by default for those languages.
> 
> There's definitely something to that idea --- certainly lanpltrusted
> dates from before we had a robust object-permissions system, and looks
> like a bit of a wart now that we do have one.
> 
> I guess we could redefine the default privileges for languages as "none"
> and then have the TRUSTED keyword mean to install public usage
> privilege.  Or maybe it would be safer for upgrade purposes if we kept
> the default interpretation as-is and did an automatic REVOKE when
> TRUSTED wasn't specified.

+1

I'll take a look at the latter option sometime in the next few weeks and
submit for the next commitfest.

Is it still worth backpatching a change to has_language_privilege as a
bug fix?

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support





Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa wrote:

> Yeah, you need that with PostgreSQL, but no with DRBD, for example
> (sorry, but DRBD is one of the flagships of HA things in the Linux
> world).  Also, I'm not convinced about the "2nd standby" thing... I
> mean, just read this on the docs, which is a little alarming:
>
> "If primary restarts while commits are waiting for acknowledgement,
> those waiting transactions will be marked fully committed once the
> primary database recovers. There is no way to be certain that all
> standbys have received all outstanding WAL data at time of the crash
> of the primary. Some transactions may not show as committed on the
> standby, even though they show as committed on the primary. The
> guarantee we offer is that the application will not receive explicit
> acknowledgement of the successful commit of a transaction until the
> WAL data is known to be safely received by the standby."
>
> So... there is no *real* warranty here either... I don't know how I
> skipped that paragraph before today... I mean, this implies that it
> is possible that a transaction could be marked as committed on the
> master, but the app was not informed of that (and thus, could try to
> send it again), and the transaction was NOT applied on the standby...
> how can this happen? I mean, when the master comes back, shouldn't the
> standby get the missing WAL pieces from the master and then apply the
> transaction? The standby part is the one that I don't really get; on
> the application side... well, there are several ways in which you can
> miss the "commit confirmation": connection issues in the worst moment,
> and such, so I guess it is not *so* serious, and the app should
> have a way of checking its last transaction if it lost connectivity to
> the server before getting the transaction committed.

But you already have that in a single server situation as well.  There
is a window between when the commit is "durable" (and visible to
others, and will be committed after recovery of a crash), when the
client doesn't yet know it's committed (and might never get the commit
message due to server crash, network disconnect, client middle-tier
crash, etc).

So people are already susceptible to that, and defending against it, no? ;-)

And they are susceptible to that if they are on PostgreSQL, Oracle, MS
SQL, DB2, etc.

a.


-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 12:17 PM, Bruce Momjian  wrote:
> On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote:
>> > From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org]
>> > On Behalf Of Jose Ildefonso Camargo Tolosa
>>
>> > Please, stop arguing on all of this: I don't think that adding an
>> > option will hurt anybody (specially because the work was already done
>> > by someone), we are not asking to change how the things work, we just
>> > want an option to decided whether we want it to freeze on standby
>> > disconnection, or if we want it to continue automatically... is that
>> > asking so much?
>>
>> I think this kind of decision should be made by an outside utility or
>> scripts.
>> It would be better if it can be detected from outside that the standby is
>> down during sync replication, and a command sent to the master to change
>> its mode or settings appropriately without stopping the master.
>> Putting more and more of this kind of logic into the replication code will
>> make it more cumbersome.
>
> We certainly would need something external to inform administrators that
> the system is no longer synchronous.

That is *mandatory*, just as you monitor DRBD or disk arrays: if a
disk fails, an alert has to be issued, so it gets fixed as soon as possible.

But such alerts can wait 30 seconds to be sent out, so any monitoring
system would be able to handle that; we just need to get the current
system status from the monitoring system and create the corresponding
rules: a simple matter, actually.



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk  wrote:
> On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas wrote:
>
>> So far as transaction durability is concerned... we have a continuous
>> background rsync over dark fiber for archived transaction logs, DRBD for
>> block-level sync, filesystem snapshots for our backups, a redundant async DR
>> cluster, an offsite backup location, and a tape archival service stretching
>> back for seven years. And none of that will cause the master to stop
>> processing transactions unless the master itself dies and triggers a
>> failover.
>
> Right, so if the dark fiber between New Orleans and Seattle (pick two
> places for your datacenter) happens to be the first thing failing in
> your NO data center.  Disconnect the sync-ness, and continue.  Not a
> problem, unless it happens to be Aug 29, 2005.
>
> You have lost data.  Maybe only a bit.  Maybe it wasn't even
> important.  But that's not for PostgreSQL to decide.

I never asked for it... but, you (the one who is configuring the
system) can decide, and should be able to decide... right now: we
can't decide.

>
> But because your PG on DRDB "continued" when it couldn't replicate to
> Seattle, it told it's clients the data was durable, just minutes
> before the whole DC was under water.

Yeah, well, what is the probability of all of that?... really tiny.  I
bet it is more likely that you win the lottery than that all of these
events happen within that time frame.  But risking monetary losses
because, for example, the online store stopped accepting orders while
the standby server was down: that's not acceptable for some companies
(and some companies just can't buy 3 x DB servers, or more!).

>
> OK, so a wise admin team would have removed the NO DC from it's
> primary role days before that hit.
>
> Change the NO to NYC and the date Sept 11, 2001.
>
> OK, so maybe we can concede that these types of major catastrophes are
> more devastating to us than losing some data.
>
> Now your primary server was in AWS US East last week.  It's sync slave
> was in the affected AZ, but your PG primary continues on, until, since
> it was a EC2 instance, it disappears.  Now where is your data?

Who would *really* trust your PostgreSQL DB to EC2?... I mean, the I/O
is not very good, and the price is not exactly so low that you would
take that risk.

All in all: you are still piling up coincidences that have a *very
low* probability.

>
> Or the fire marshall orders the data center (or whole building) EPO,
> and the connection to your backup goes down minutes before your
> servers or other network peers.
>
>> Using PG sync in its current incarnation would introduce an extra failure
>> scenario that wasn't there before. I'm pretty sure we're not the only ones
>> avoiding it for exactly that reason. Our queue discards messages it can't
>> fulfil within ten seconds and then throws an error for each one. We need to
>> decouple the secondary as quickly as possible if it becomes unresponsive,
>> and there's really no way to do that without something in the database, one
>> way or another.
>
> It introduces an "extra failure", because it has introduce an "extra
> data durability guarantee".
>
> Sure, many people don't *really* want that data durability guarantee,
> even though they would like the "maybe guaranteed" version of it.
>
> But that fine line is actually a difficult (impossible?) one to define
> if you don't know, at the moment of decision, what the next few
> moments will/could become.

You *never* know.  And the truth is that you have to make the decision
with what you have.  If you can pay for 10 servers nationwide: good for
you; not all of us can afford that (man, I could barely pay for two,
and that's because I *know* I don't want to risk losing the data or the
service because a single server died).

As it currently is, the point of freezing the master because the standby
dies is not good for all cases (and I dare say: for most cases), and
having to wait for pacemaker or other monitoring to note that, change
the master config, and reload... it will cause a service disruption! (for
several seconds, usually ~30 seconds).



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:35 AM, Dimitri Fontaine
 wrote:
> Hi,
>
> Jose Ildefonso Camargo Tolosa  writes:
>> environments.  And no, it doesn't makes synchronous replication
>> meaningless, because it will work synchronous if it have someone to
>> sync to, and work async (or standalone) if it doesn't: that's perfect
>> for HA environment.
>
> You seem to want Service Availibility when we are providing Data
> Availibility. I'm not saying you shouldn't ask what you're asking, just
> that it is a different need.

Yes and no: I don't see why we can't have an option to choose which
one we want.  I can see the point of "data availability": it is better
to freeze the service than to risk losing transactions... however, try to
explain that to some managers: "well, you know, the DB server froze
the whole bank system because, well, the standby server died, and we
didn't want to risk transaction loss, so we just froze the master, you
know, in case the master were to die too before we had a reliable
standby."  I don't think a manager would really understand why you
would block the whole company's system just because *the standby*
server died (and why don't you block it when the master dies?!).
Now, maybe that's a bad example, I know a bank should have at least 3
or 4 servers, with some of them in different geographical areas, but
just think of the typical boss.

In "Service Availability", you have data availability most of the
time, until one of the servers fails (if you have just 2 nodes); if
you have more than two: well, good for you!  But you can keep
going with a single server, understanding that you are at high risk,
which has to be fixed really soon (an emergency).

>
> If you troll the archives, you will see that this debate has received
> much consideration already. The conclusion is that if you care about
> Service Availibility you should have 2 standby servers and set them both
> as candidates to being the synchronous one.

That's more cost, and for most applications it isn't worth the extra cost.

Really, I see the point you have, and I have *never* asked to remove
the data warranties, but to have an option to relax it, if the
particular situation requires it: "enough safety" for a given cost.

>
> That way, when you lose one standby the service is unaffected, the
> second standby is now the synchronous one, and it's possible to
> re-attach the failed standby live, with or without archiving (with is
> preferred so that the master isn't involved in the catch-up phase).
>
>> As synchronous standby currently is, it just doesn't fit the HA usage,
>
> It does actually allow both data high availability and service high
> availability, provided that you feed at least two standbys.

Still, it doesn't fit.  You need to spend more on hardware, and more on
power (and money there), and more carbon footprint... you get the point.
Also, having 3 servers for your DB can be necessary (and possible) for
some companies, but for others: no.

>
> What you seem to be asking is both data and service high availability
> with only two nodes. You're right that we can not provide that with
> current releases of PostgreSQL. I'm not sure anyone has a solid plan to
> make that happen.
>
>> and if you really want to keep it that way, it doesn't belong to the
>> HA chapter on the pgsql documentation, and should be moved.  And NO
>> async replication will *not* work for HA, because the master can have
>> more transactions than standby, and if the master crashes, the standby
>> will have no way to recover these transactions, with synchronous
>> replication we have *exactly* what we need: the data in the standby,
>> after all, it will apply it once we promote it.
>
> Exactly. We want data availability first. Service availability is
> important too, and for that you need another standby.

Yeah, you need that with PostgreSQL, but no with DRBD, for example
(sorry, but DRBD is one of the flagships of HA things in the Linux
world).  Also, I'm not convinced about the "2nd standby" thing... I
mean, just read this on the docs, which is a little alarming:

"If primary restarts while commits are waiting for acknowledgement,
those waiting transactions will be marked fully committed once the
primary database recovers. There is no way to be certain that all
standbys have received all outstanding WAL data at time of the crash
of the primary. Some transactions may not show as committed on the
standby, even though they show as committed on the primary. The
guarantee we offer is that the application will not receive explicit
acknowledgement of the successful commit of a transaction until the
WAL data is known to be safely received by the standby."

So... there is no *real* warranty here either... I don't know how I
skipped that paragraph before today... I mean, this implies that it
is possible that a transaction could be marked as committed on the
master, but the app was not informed of that (and thus, could try to
send it again), and the transaction was NOT applied on the standby...
how can this happen? I mean, when the master comes back, shouldn't the
standby get the missing WAL pieces from the master and then apply the
transaction? The standby part is the one that I don't really get; on
the application side... well, there are several ways in which you can
miss the "commit confirmation": connection issues in the worst moment,
and such, so I guess it is not *so* serious, and the app should
have a way of checking its last transaction if it lost connectivity to
the server before getting the transaction committed.

Re: [HACKERS] Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

2012-07-12 Thread Gurjeet Singh
On Tue, Jul 10, 2012 at 11:11 AM, Greg Stark  wrote:

> On Tue, Jul 10, 2012 at 3:44 PM, Tom Lane  wrote:
> >> The problem you describe is one of constraints and dependencies and
> >> not one of indexes. It seems what you really want is a way to alter
> >> foreign key dependencies to depend on a new index. Either an explicit
> >> command that lets you set the new dependency or what seems even better
> >> would be to have DROP INDEX check any dependent objects to see if
> >> there's another index that can satisfy them and change their
> >> dependency.
> >
> > Either of these have exactly the same issue, namely their correctness
> > depends on determining if two indexes have identical properties.
>
> This doesn't sound right to me. In these cases all it would have to
> know about is the same set of properties that CREATE CONSTRAINT looks
> for to find a satisfactory index to depend on.
>

I like the DROP INDEX idea, but the silent side-effect may not make people
happy. Can you give me a pointer to the relevant code?

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

2012-07-12 Thread Gurjeet Singh
On Tue, Jul 10, 2012 at 10:44 AM, Tom Lane  wrote:

> Greg Stark  writes:
> > On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh 
> wrote:
> >> All we need to do is allow swapping of pg_class.relfilenode of two
> indexes.
>
> > Fwiw I don't like swapping relfilenodes on indexes the user created.
> > REINDEX currently does this but it's a bit of a hack and only works
> > because reindex carefully builds the new index with exactly the same
> > definition as the old one.
>
> Yes.  The swap-relfilenodes operation would have to carefully check that
> the index definitions were exactly equivalent, and there would be a
> constant risk for bugs of omission if that code weren't taught about
> any new index properties we invent.
>

IMHO there must be many other places in this code base where we run that
risk.

The way I was planning to do it is to compare all relevant fields of
FormData_pg_index, and I am assuming anybody changing the struct members
will take care of the relevant changes needed for this code too.

We can add a runtime/compile-time assert to make sure that
Natts_pg_index==17. That way, if a new column gets added, we will get
alerted promptly.
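
For instance, something along these lines (a hypothetical sketch, assuming a
StaticAssertStmt-style compile-time assertion macro is available; 17 is the
pg_index column count at the time of writing):

#include "postgres.h"
#include "catalog/pg_index.h"

static void
check_pg_index_shape(void)
{
    /* Fail the build if pg_index grows a column that the
     * index-equivalence checks don't know about yet. */
    StaticAssertStmt(Natts_pg_index == 17,
                     "pg_index changed: update index-equivalence checks");
}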


> All of these things seem like ugly, hard-to-use kluges anyway (the
> make-sure-the-indexes-match business is just as much of a PITA for the
> DBA as it is for the system).  What we really want is REINDEX
> CONCURRENTLY.
>

+1, but I can't take on that task.

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: FW: [HACKERS] Allow replacement of bloated primary key indexes without foreign key rebuilds

2012-07-12 Thread Gurjeet Singh
On Tue, Jul 10, 2012 at 2:40 AM, Amit Kapila wrote:

>> Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT
>> ... USING INDEX we added back in the day is not so useful in the field.
>>
>> Having to drop foreign key constraints before this command, and recreate
>> them afterwards makes this command useless to most database setups. I feel
>> sorry that no one brought this up when we were implementing the feature;
>> maybe we could've done something about it right then.
>
> Will it impact the user such that it will block its operation or something
> similar, or is it a usability issue?

Yes, it will have to take an exclusive lock on the index, and possibly the
table too, but the operation should be too quick to even be noticeable in
low-load conditions.

However, if the x-lock is waiting for some other long-running query to
finish, then the lock-queuing logic in Postgres will make new queries wait
for this x-lock to be taken and released before any new query can begin
processing. This is my recollection of the logic from an old conversation;
others can weigh in to confirm.

>> All we need to do is allow swapping of pg_class.relfilenode of two
>> indexes. This will let the dependency entries stand as they are and allow
>> us to drop the bloated primary key index structure without having to
>> rebuild the foreign key constraints.
>
> I have noticed that currently the OID and pg_class.relfilenode are the same
> for user-created tables and indexes. But after your implementation that
> will not remain the same; I am not sure whether it can impact any other
> code path.

They start off as the same, but some operations, like REINDEX, change the
relfilenode; that's the purpose of relfilenode: to map the OID to a
filename on disk.

>> As for the syntactical sugar, this can be added to either ALTER TABLE or
>> to ALTER INDEX. Although under no normal circumstances one would need to
>> use ALTER INDEX to swap two indexes' relfilenodes (because one can easily
>> create a duplicate index and drop/rename-in-place the old one), I think it
>> would make more sense here since it is just an operation on two indexes
>> and has nothing to do with the constraints, apart from the fact that we
>> want to use this feature to meddle with the constraints.
>>
>> Syntax options:
>>
>> ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE}
>> USING INDEX new_index;
>>
>> ALTER INDEX ind REPLACE WITH new_index;
>
> After this new syntax there will be 2 ways for users to do the replacement
> of an index; won't it confuse users as to which syntax to use?

Yes, I forgot to mention this in the original post. This feature will be a
superset of the feature we introduced in ALTER TABLE. I don't see a way
around that, except for slowly deprecating the older feature.

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Tom Lane
Peter Eisentraut  writes:
> As long as we're spending time on this, I'd propose getting rid of
> lanpltrusted, at least for access checking.  Instead, just don't
> install USAGE privileges by default for those languages.

There's definitely something to that idea --- certainly lanpltrusted
dates from before we had a robust object-permissions system, and looks
like a bit of a wart now that we do have one.

I guess we could redefine the default privileges for languages as "none"
and then have the TRUSTED keyword mean to install public usage
privilege.  Or maybe it would be safer for upgrade purposes if we kept
the default interpretation as-is and did an automatic REVOKE when
TRUSTED wasn't specified.

regards, tom lane



Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Peter Eisentraut
On tor, 2012-07-12 at 01:40 -0400, Tom Lane wrote:
> So it seems arguably reasonable to me for has_language_privilege()
> to take superuserness and lanpltrusted into account, without thereby
> concluding that other privilege() functions must do more than they
> do today.  If we don't want it to do that, then I think we ought to
> offer some other function that *does* consider those things ... but
> I'm not seeing the value of separating it out.

As long as we're spending time on this, I'd propose getting rid of
lanpltrusted, at least for access checking.  Instead, just don't
install USAGE privileges by default for those languages.

The reason is that there is value in having a role that can deploy
schemas, possibly containing functions in untrusted languages, without
having to be a full superuser.  Just like you can have a user that can
create roles without being a superuser.

> The sepgsql point is worth discussing too.  I have not been paying
> close attention to the sepgsql patches, but I have the distinct
> impression that they create a non-examinable privilege barrier,
> ie there's no way to inquire whether you have the privilege to do
> X except by actually trying it.  Is that really the way we want
> things to go?

Well, that's how SELinux works too.  You can inspect the labels and all
that, but nobody really knows what's going to happen until you try it.

Which is ultimately the recommended way anyway.  has_*_privilege is a
bit like the access() function, which has caveats associated with it.





Re: [HACKERS] compiler warnings on the buildfarm

2012-07-12 Thread Peter Eisentraut
On sön, 2012-07-01 at 19:04 +0200, Stefan Kaltenbrunner wrote:
> seeing some of the latest commits about fixing compiler warnings I
> took a look at the buildfarm to see if there are any interesting ones
> there (in total we have thousands of warnings on the buildfarm but
> most of those are from very noisy compilers).
> 
> so in case anybody is interested those are a selection of the ones
> that at least look somewhat interesting(duplicates mostly removed,
> windows ignored):

Many of these come from ancient compilers, and from minor versions that
are not even the latest for that ancient major release.  They're mostly
not worth worrying about, because evidently the compiler developers
later improved the compilers to not warn about these cases anymore.





Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-07-12 Thread Peter Eisentraut
On lör, 2012-06-23 at 23:08 +0100, Dean Rasheed wrote:
> I spotted a couple of other issues during testing:
> 
> * You're still allowing INCLUDING DEFAULTS and INCLUDING STORAGE, even
> though these options are not supported on foreign tables.
> 
> * If I do INCLUDING ALL, I get an error because of the unsupported
> options. I think that "ALL" in this context needs to be made to mean
> all the options that foreign tables support (just COMMENTS at the
> moment).

Note that when I added CREATE TABLE LIKE to support composite types, it
was decided to ignore non-applicable options (like copying indexes from
types or views etc.).  The same should be done here, unless we have
reasons to revise the earlier decision.





[HACKERS] sign mismatch in walreceiver.c

2012-07-12 Thread Peter Eisentraut
This looks suspicious:

static TimeLineID   recvFileTLI = -1;

because TimeLineID is uint32.  The Solaris compiler complains about the
sign mismatch.

Maybe 0 would be a better initial value?
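
A minimal sketch of that suggestion (0 not being a valid timeline ID):

#include "access/xlogdefs.h"

static TimeLineID   recvFileTLI = 0;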





Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas

On 07/12/2012 12:02 PM, Bruce Momjian wrote:

> Well, the problem also exists if we add it as an internal database
> feature --- how long do we wait to consider the standby dead, how do
> we inform administrators, etc.

True. Though if there is no secondary connected, either because it's not
there yet, or because it disconnected, that's an easy check. It's the
network lag/stall detection that's tricky.

> I don't think anyone says the feature is useless, but it isn't going
> to be a simple boolean either.

Oh $Deity no. I'd never suggest that. I just tend to be overly verbose,
and sometimes my intent gets lost in the rambling as I try to explain my
perspective. I apologize if it somehow came across that anyone could
just flip a switch and have it work.

My C is way too rusty, or I'd be writing an extension right now to do
this, or be looking over that patch I linked to originally to make
suitable adaptations. I know I talk about how relatively handy DRBD is,
but it's also a gigantic PITA since it has to exist underneath the
actual filesystem. :)


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com






Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file foreign tables

2012-07-12 Thread Tom Lane
"Etsuro Fujita"  writes:
> Besides the revision, I modified check_selective_binary_conversion() to run
> heap_close() in the whole-row-reference case.  Attached is an updated
> version of the patch.

Applied with minor, mostly-cosmetic revisions.  I did fix
check_selective_binary_conversion to not continue touching the
relation's tupledesc after heap_close.  Also I thought
"convert_selectively" was a better name for the hidden COPY option.

regards, tom lane



Re: [HACKERS] WIP pgindent replacement

2012-07-12 Thread Bruce Momjian
On Wed, Jun 22, 2011 at 10:16:28AM -0400, Bruce Momjian wrote:
> Andrew Dunstan wrote:
> > >> Further research shows that C89 explicitly dropped support for the old
> > >> K&R "=-" operator, so we probably *should* remove this in case it
> > >> introduces an unintended bug.
> > > Well, the point is if someone does use that, it isn't going to generate
> > > a pgindent error, but rather produce incorrect C code because =- is
> > > going to be changed.  FYI, my gcc 2.95.3 allows =- and does work as
> > > intended.
> > >
> > 
> > As intended by whom? If the effect of "x=4; x =- 1;" is to subtract 1 
> > from x then that's simply wrong by C89. It should assign -1 to x. The 
> > "=-" must be parsed as two operators in C89, assignment and unary minus. 
> > pgindent should not under any circumstances change the semantics of the 
> > program being indented, and that's what this transformation does for 
> > compilers conforming to the standard we explicitly follow.
> > 
> > What happens when your ancient gcc is told to apply the ansi standard?
> 
> I see now that my test wasn't complete.  You are right it assigns -1 so
> we can remove this from pgindent.

Per report from last year, removed from pgindent.
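
For reference, the construct under discussion (illustrative snippet; a
conforming C89 compiler prints -1, while a K&R-era compiler that still
honors the old "=-" compound operator would leave x at 3):

#include <stdio.h>

int
main(void)
{
    int x = 4;

    x =- 1;             /* C89: parsed as x = (-1); K&R: x = x - 1 */
    printf("%d\n", x);
    return 0;
}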

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-12 Thread Tom Lane
Here is a revised version of the timeout-infrastructure patch.
I whacked it around quite a bit, notably:

* I decided that the most convenient way to handle the initialization
issue was to combine establishment of the signal handler with resetting
of the per-process variables.  So handle_sig_alarm is no longer global,
and InitializeTimeouts is called at the places where we used to do
"pqsignal(SIGALRM, handle_sig_alarm);".  I believe this is correct
because any subprocess that was intending to use SIGALRM must have
called that before establishing any timeouts.

* BTW, doing that exposed the fact that walsender processes were failing
to establish a SIGALRM signal handler, which is a pre-existing bug,
because they run a normal authentication transaction during InitPostgres
and hence need to be able to cope with deadlock and statement timeouts.
I will do something about back-patching a fix for that.

* I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
done in walsender and autovacuum processes.  I'm not totally satisfied
with that, but for sure it didn't work to only establish them in
regular backends.

* I didn't like the "TimeoutName" nomenclature, because to me "name"
suggests that the value is a string, not just an enum.  So I renamed
that to TimeoutId.
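
To illustrate the resulting API shape, here is a sketch of how a backend
would now set up a timeout (the handler name and exact signatures are my
reading of the patch description, so treat them as assumptions):

#include "postgres.h"
#include "utils/timeout.h"

/* hypothetical handler; a real one would set flags, cancel the query, etc. */
static void
StatementTimeoutHandler(void)
{
    /* ... */
}

static void
setup_backend_timeouts(void)
{
    /* installs the SIGALRM handler and resets per-process timeout state;
     * replaces the old pqsignal(SIGALRM, handle_sig_alarm) call */
    InitializeTimeouts();

    /* each timeout now registers its handler under a TimeoutId */
    RegisterTimeout(STATEMENT_TIMEOUT, StatementTimeoutHandler);

    /* later, at statement start, something like:
     *     enable_timeout_after(STATEMENT_TIMEOUT, delay_ms);
     */
}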

* I whacked around the logic in timeout.c a fair amount, because it
had race conditions if SIGALRM happened while enabling or disabling
a timeout.  I believe the attached coding is safe, but I'm not totally
happy with it from a performance standpoint, because it will do two
setitimer calls (a disable and then a re-enable) in many cases where
the old code did only one.

I think what we ought to do is go ahead and apply this, so that we
can have the API nailed down, and then we can revisit the internals
of timeout.c to see if we can't get the performance back up.
It's clearly a much cleaner design than the old spaghetti logic in
proc.c, so I think we ought to go ahead with this independently of
whether the second patch gets accepted.

I haven't really looked at the second patch yet, but at minimum that
will need some rebasing to match the API tweaks here.

regards, tom lane



binwIymnjnW5K.bin
Description: 1-timeout-framework-v16.patch.gz



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 08:21:08AM -0500, Shaun Thomas wrote:
> >But, putting that aside, why not write a piece of middleware that
> >does precisely this, or whatever you want? It can live on the same
> >machine as Postgres and ack synchronous commit when nobody is home,
> >and notify (e.g. page) you in the most precise way you want if nobody
> >is home "for a while".
> 
> You're right that there are lots of ways to kinda get this ability,
> they're just not mature enough or capable enough to really matter.
> Tailing the log to watch for secondary disconnect is too slow. Monit
> or Nagios style checks are too slow and unreliable. A custom-built
> middle-layer (a master-slave plugin for Pacemaker, for example) is
> too slow. All of these would rely on some kind of check interval.
> Set that too high, and we get 10,000xn missed transactions for n
> seconds. Too low, and we'd increase the likelihood of false
> positives and unnecessary detachments.

Well, the problem also exists if we add it as an internal database feature
--- how long do we wait to consider the standby dead, how do we inform
administrators, etc.

I don't think anyone says the feature is useless, but it isn't going to
be a simple boolean either.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote:
> > From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] 
> > On Behalf Of Jose Ildefonso Camargo Tolosa
> 
> > Please, stop arguing on all of this: I don't think that adding an
> > option will hurt anybody (specially because the work was already done
> > by someone), we are not asking to change how the things work, we just
> > want an option to decided whether we want it to freeze on standby
> > disconnection, or if we want it to continue automatically... is that
> > asking so much?
> 
> I think this kind of decision should be made by an outside utility or
> scripts.
> It would be better if it can be detected from outside that the standby is
> down during sync replication, and a command sent to the master to change
> its mode or settings appropriately without stopping the master.
> Putting more and more of this kind of logic into the replication code will
> make it more cumbersome.

We certainly would need something external to inform administrators that
the system is no longer synchronous.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Magnus Hagander
On Thu, Jul 12, 2012 at 6:07 PM, Fujii Masao  wrote:
> On Thu, Jul 12, 2012 at 8:39 PM, Magnus Hagander  wrote:
>> On Tue, Jul 10, 2012 at 7:03 PM, Fujii Masao  wrote:
>>> On Tue, Jul 10, 2012 at 3:23 AM, Fujii Masao  wrote:
 Hi,

 I found several problems in pg_receivexlog, e.g., memory leaks,
 file-descripter leaks, ..etc. The attached patch fixes these problems.

 ISTM there are still some other problems in pg_receivexlog, so I'll
 read it deeply later.
>>>
>>> While pg_basebackup background process is streaming WAL records,
>>> if its replication connection is terminated (e.g., walsender in the server
>>> is accidentally terminated by SIGTERM signal), pg_basebackup ends
>>> up failing to include all required WAL files in the backup. The problem
>>> is that, in this case, pg_basebackup doesn't emit any error message at all.
>>> So a user might misunderstand that a base backup has been successfully
>>> taken even though it doesn't include all required WAL files.
>>
>> Ouch. That is definitely a bug if it behaves that way.
>>
>>
>>> To fix this problem, I think that, when the replication connection is
>>> terminated, ReceiveXlogStream() should check whether we've already
>>> reached the stop point by calling stream_stop() before returning TRUE.
>>> If we've not yet (this means that we've not received all required WAL
>>> files yet), ReceiveXlogStream() should return FALSE and
>>> pg_basebackup should emit an error message.  Comments?
>>
>> Doesn't it already return false because it detects the error of the
>> connection? What's the codepath where we end up returning true even
>> though we had a connection failure? Shouldn't that end up under the
>> "could not read copy data" branch, which already returns false?
>
> You're right. If the error is detected, that function always returns false
> and the error message is emitted (but I think that current error message
> "pg_basebackup: child process exited with error 1" is confusing),
> so it's OK. But if walsender in the server is terminated by SIGTERM,
> no error is detected and pg_basebackup background process gets out
> of the loop in ReceiveXlogStream() and returns true.

Oh. Because the server does a graceful shutdown. D'uh, of course.

Then yes, your suggested fix seems like a good one.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Fujii Masao
On Thu, Jul 12, 2012 at 8:39 PM, Magnus Hagander  wrote:
> On Tue, Jul 10, 2012 at 7:03 PM, Fujii Masao  wrote:
>> On Tue, Jul 10, 2012 at 3:23 AM, Fujii Masao  wrote:
>>> Hi,
>>>
>>> I found several problems in pg_receivexlog, e.g., memory leaks,
>> file-descriptor leaks, etc. The attached patch fixes these problems.
>>>
>>> ISTM there are still some other problems in pg_receivexlog, so I'll
>>> read it deeply later.
>>
>> While pg_basebackup background process is streaming WAL records,
>> if its replication connection is terminated (e.g., walsender in the server
>> is accidentally terminated by SIGTERM signal), pg_basebackup ends
>> up failing to include all required WAL files in the backup. The problem
>> is that, in this case, pg_basebackup doesn't emit any error message at all.
>> So a user might misunderstand that a base backup has been successfully
>> taken even though it doesn't include all required WAL files.
>
> Ouch. That is definitely a bug if it behaves that way.
>
>
>> To fix this problem, I think that, when the replication connection is
>> terminated, ReceiveXlogStream() should check whether we've already
>> reached the stop point by calling stream_stop() before returning TRUE.
>> If we've not yet (this means that we've not received all required WAL
>> files yet), ReceiveXlogStream() should return FALSE and
>> pg_basebackup should emit an error message.  Comments?
>
> Doesn't it already return false because it detects the error of the
> connection? What's the codepath where we end up returning true even
> though we had a connection failure? Shouldn't that end up under the
> "could not read copy data" branch, which already returns false?

You're right. If the error is detected, that function always returns false
and the error message is emitted (but I think the current error message
"pg_basebackup: child process exited with error 1" is confusing),
so it's OK. But if walsender in the server is terminated by SIGTERM,
no error is detected and the pg_basebackup background process gets out
of the loop in ReceiveXlogStream() and returns true.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Andrew Dunstan


On 07/12/2012 10:01 AM, Tom Lane wrote:


FWIW, I think you could save a level of naming if you were willing to
put the type first, since the type would imply whether the object
lives in a schema or not:

[type]/[name].sql
[type]/[schema]/[name].sql






That will destroy the property of having everything for a given schema 
collected together.


Arguably we should make a special case for the create statement of a 
schema, but I'm not even sure about that.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Joel Jacobson
On Thu, Jul 12, 2012 at 4:01 PM, Tom Lane  wrote:

> FWIW, I think you could save a level of naming if you were willing to
> put the type first, since the type would imply whether the object
> lives in a schema or not:
>
> [type]/[name].sql
> [type]/[schema]/[name].sql
>
>
Could work. But I think it's more relevant and useful to keep all objects
in a schema in its own directory.

That way it's easier to get an overview of what's in a schema,
simply by looking at the file structure of the schema directory.

I think it's more common to want to "show all objects within schema X"
than "show all schemas of type X".

PS.

I was thinking -- the guys back in the 70s must have spent a lot of time
thinking about the UNIX directory structure -- before they decided upon it.

I did some googling and found this explanation, which was quite
amusing to say the least :-)

http://lists.busybox.net/pipermail/busybox/2010-December/074114.html


Re: [HACKERS] Schema version management

2012-07-12 Thread Tom Lane
Joel Jacobson  writes:
> On Thursday, July 12, 2012, Tom Lane wrote:
>> What are you going to do with objects that don't have schemas?
>> (Including, but not restricted to, the schemas themselves.)

> Good question. Maybe something like this?

> For objects without schema:
> /global/[type]/[name].sql

> For objects with schema:
> /schema/[schema]/[type]/[name].sql

FWIW, I think you could save a level of naming if you were willing to
put the type first, since the type would imply whether the object
lives in a schema or not:

[type]/[name].sql
[type]/[schema]/[name].sql

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas  wrote:

> So far as transaction durability is concerned... we have a continuous
> background rsync over dark fiber for archived transaction logs, DRBD for
> block-level sync, filesystem snapshots for our backups, a redundant async DR
> cluster, an offsite backup location, and a tape archival service stretching
> back for seven years. And none of that will cause the master to stop
> processing transactions unless the master itself dies and triggers a
> failover.

Right, so if the dark fiber between New Orleans and Seattle (pick two
places for your datacenter) happens to be the first thing failing in
your NO data center.  Disconnect the sync-ness, and continue.  Not a
problem, unless it happens to be Aug 29, 2005.

You have lost data.  Maybe only a bit.  Maybe it wasn't even
important.  But that's not for PostgreSQL to decide.

But because your PG on DRBD "continued" when it couldn't replicate to
Seattle, it told its clients the data was durable, just minutes
before the whole DC was under water.

OK, so a wise admin team would have removed the NO DC from its
primary role days before that hit.

Change the NO to NYC and the date Sept 11, 2001.

OK, so maybe we can concede that these types of major catastrophes are
more devastating to us than losing some data.

Now your primary server was in AWS US East last week.  Its sync slave
was in the affected AZ, but your PG primary continues on, until, since
it was an EC2 instance, it disappears.  Now where is your data?

Or the fire marshal orders the data center (or whole building) EPO,
and the connection to your backup goes down minutes before your
servers or other network peers.

> Using PG sync in its current incarnation would introduce an extra failure
> scenario that wasn't there before. I'm pretty sure we're not the only ones
> avoiding it for exactly that reason. Our queue discards messages it can't
> fulfil within ten seconds and then throws an error for each one. We need to
> decouple the secondary as quickly as possible if it becomes unresponsive,
> and there's really no way to do that without something in the database, one
> way or another.

It introduces an "extra failure", because it has introduced an "extra
data durability guarantee".

Sure, many people don't *really* want that data durability guarantee,
even though they would like the "maybe guaranteed" version of it.

But that fine line is actually a difficult (impossible?) one to define
if you don't know, at the moment of decision, what the next few
moments will/could become.

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Joel Jacobson
On Thursday, July 12, 2012, Tom Lane wrote:
>
> What are you going to do with objects that don't have schemas?
> (Including, but not restricted to, the schemas themselves.)
>

Good question. Maybe something like this?

For objects without schema:
/global/[type]/[name].sql

For objects with schema:
/schema/[schema]/[type]/[name].sql


Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts

2012-07-12 Thread Heikki Linnakangas

On 09.07.2012 11:35, Shigeru HANADA wrote:

Once the issues above are fixed, IMO this patch can be marked as "Ready
for committer".


Thanks. The docs on async connections say:

"The connect_timeout connection parameter is ignored when using 
PQconnectPoll; it is the application's responsibility to decide whether 
an excessive amount of time has elapsed."


I think we should recommend using PQconnectTimeout(), similar to what 
you did in psql, in all client applications that use the non-blocking 
connection API. Perhaps something like:


"The connect_timeout connection parameter is ignored when using 
PQconnectPoll; it is the application's responsibility to decide whether 
an excessive amount of time has elapsed. It is recommended to use 
PQconnectTimeout() to get the value of the parameter, and use that as the 
timeout in the application. That way the user gets the same timeout 
behavior, regardless of whether the application uses PQconnectdb or 
the nonblocking connection API."
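For client authors, a hedged sketch of what that recommendation amounts to
(PQconnectTimeout() is the accessor discussed in this thread; the rest is
standard libpq, with error handling trimmed):

#include <stdlib.h>
#include <sys/select.h>
#include <time.h>
#include <libpq-fe.h>

static PGconn *
connect_with_timeout(const char *conninfo)
{
	PGconn	   *conn = PQconnectStart(conninfo);
	const char *timeout_str;
	int			timeout = 0;
	time_t		deadline;

	if (conn == NULL || PQstatus(conn) == CONNECTION_BAD)
		return conn;

	/* connect_timeout as given in the conninfo string, if any */
	timeout_str = PQconnectTimeout(conn);
	if (timeout_str != NULL)
		timeout = atoi(timeout_str);
	deadline = time(NULL) + timeout;

	for (;;)
	{
		PostgresPollingStatusType status = PQconnectPoll(conn);
		fd_set		rfds,
					wfds;
		struct timeval tv = {1, 0};		/* wake up once a second */
		int			sock = PQsocket(conn);

		if (status == PGRES_POLLING_OK || status == PGRES_POLLING_FAILED)
			return conn;

		FD_ZERO(&rfds);
		FD_ZERO(&wfds);
		if (status == PGRES_POLLING_READING)
			FD_SET(sock, &rfds);
		else
			FD_SET(sock, &wfds);
		(void) select(sock + 1, &rfds, &wfds, NULL, &tv);

		/* the application-side timeout check the docs ask for */
		if (timeout > 0 && time(NULL) >= deadline)
		{
			PQfinish(conn);
			return NULL;
		}
	}
}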


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Tom Lane
Joel Jacobson  writes:
> I think the directory structure [schema]/[type]/[name] should be the same
> for all object types. I don't like "operator" being part of the filename,
> it should be the directory name.

What are you going to do with objects that don't have schemas?
(Including, but not restricted to, the schemas themselves.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas

On 07/12/2012 12:31 AM, Daniel Farina wrote:


But RAID-1 as nominally seen is a fundamentally different problem,
with much tinier differences in latency, bandwidth, and connectivity.
Perhaps useful for study, but to suggest the problem is *that* similar
I think is wrong.


Well, yes and no. One of the reasons I brought up DRBD was because it's 
basically RAID-1 over a network interface. It's not without overhead, 
but a few basic pgbench tests show it's still 10-15% faster than a 
synchronous PG setup for two servers in the same rack. Greg Smith's 
tests show that beyond a certain point, a synchronous PG setup 
effectively becomes untenable simply due to network latency in the 
protocol implementation. In reality, it probably wouldn't be usable 
beyond two servers in different datacenters in the same city.


RAID-1 was the model for DRBD, but I brought it up only because it's 
pretty much the definition of a synchronous commit that degrades 
gracefully. I'd even suggest it's more important in a network context 
than for RAID-1, because you're far more likely to get sync 
interruptions due to network issues than you are for a disk to fail.



But, putting that aside, why not write a piece of middleware that
does precisely this, or whatever you want? It can live on the same
machine as Postgres and ack synchronous commit when nobody is home,
and notify (e.g. page) you in the most precise way you want if nobody
is home "for a while".


You're right that there are lots of ways to kinda get this ability, 
they're just not mature enough or capable enough to really matter. 
Tailing the log to watch for secondary disconnect is too slow. Monit or 
Nagios style checks are too slow and unreliable. A custom-built 
middle-layer (a master-slave plugin for Pacemaker, for example) is too 
slow. All of these would rely on some kind of check interval. Set that 
too high, and we get 10,000 × n missed transactions for n seconds. Too 
low, and we'd increase the likelihood of false positives and unnecessary 
detachments.


If it's possible through a PG 9.x extension, that'd probably be the way 
to *safely* handle it as a bolt-on solution. If the original author of 
the patch can convert it to such a beast, we'd install it approximately 
five seconds after it finished compiling.


So far as transaction durability is concerned... we have a continuous 
background rsync over dark fiber for archived transaction logs, DRBD for 
block-level sync, filesystem snapshots for our backups, a redundant 
async DR cluster, an offsite backup location, and a tape archival 
service stretching back for seven years. And none of that will cause the 
master to stop processing transactions unless the master itself dies and 
triggers a failover.


Using PG sync in its current incarnation would introduce an extra 
failure scenario that wasn't there before. I'm pretty sure we're not the 
only ones avoiding it for exactly that reason. Our queue discards 
messages it can't fulfil within ten seconds and then throws an error for 
each one. We need to decouple the secondary as quickly as possible if it 
becomes unresponsive, and there's really no way to do that without 
something in the database, one way or another.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG9.2 and FDW query planning.

2012-07-12 Thread Ronan Dunklau

On 05/07/2012 18:30, Tom Lane wrote:
> Ronan Dunklau  writes:
>> Let's say I have an IMAP foreign data wrapper, and I write a
>> query joining the table on itself using the In-Reply-To and
>> Message-ID headers, is there anything I can do to avoid fetching
>> all the mails from the remote server ?
> 
>> If I could somehow inform the planner that it can look up rows
>> by message-id, thus avoiding the need to fetch everything from
>> the remote server. Perhaps "persuading" the planner to use a
>> nested-loop ?
> 
> OK, so what you're saying is that the imap server can effectively 
> provide an index on message_id.  What you'd do is create a
> parameterized path that uses the tbl.message_id =
> other_tbl.in_reply_to join clause. If that's enough cheaper than a
> full scan, the planner would select it.

Thank you, I was able to build such paths from your indication.

The python FDW implementor can optionally give a list of tuples
consisting of (path key, expected_number_of_row). So, in the imap
example that could be [('Message-ID', 1), ('From', 1000)] for example.

From this information, if there is an equivalence-class restrictinfo
that uses one of those keys, we build a parameterized path,
with an associated cost of base_width * expected_number_of_row, in
addition to the generic, unparameterized path.
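In planner terms, the sort of path-building this describes could look
roughly like the sketch below (9.2-era API; imap_expected_rows() and the
cost numbers are hypothetical stand-ins for the wrapper's hints):

#include "postgres.h"
#include "optimizer/pathnode.h"

static double imap_expected_rows(RestrictInfo *rinfo);	/* hypothetical */

static void
imapGetForeignPaths(PlannerInfo *root, RelOptInfo *baserel, Oid foreigntableid)
{
	ListCell   *lc;

	/* The generic, unparameterized path: fetch everything remotely. */
	add_path(baserel, (Path *)
			 create_foreignscan_path(root, baserel,
									 baserel->rows,
									 10.0,						/* startup cost */
									 baserel->rows * 1000.0,	/* total cost */
									 NIL,						/* no pathkeys */
									 NULL,						/* unparameterized */
									 NIL));

	/* One parameterized path per join clause that matches a hinted key. */
	foreach(lc, baserel->joininfo)
	{
		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
		double		nrows = imap_expected_rows(rinfo);
		Relids		required_outer;

		if (nrows < 0)
			continue;			/* clause not usable as a remote lookup */

		/* Depend on whichever rels supply the outer side of the clause. */
		required_outer = bms_difference(rinfo->clause_relids,
										baserel->relids);
		add_path(baserel, (Path *)
				 create_foreignscan_path(root, baserel,
										 nrows,
										 10.0,
										 baserel->width * nrows,
										 NIL,
										 required_outer,
										 NIL));
	}
}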

The planner can then select this path, and build plans looking like this:

postgres=# explain select m1."From",
   m1."To",
   m2."From",
   m2."To"
from mails m1 inner join mails m2 on m2."Message-ID" = m1."In-Reply-To"
where m1."From" = '%t...@example.com%';

QUERY PLAN
-----------------------------------------------------------------------
 Nested Loop  (cost=10.00..60001000.00 rows=5 width=128)
   ->  Foreign Scan on mails m1  (cost=0.00..3000.00 rows=10
width=300)
 Filter: (("From")::text = '%t...@example.com%'::text)
   ->  Foreign Scan on mails m2  (cost=10.00..300.00 rows=1 width=300)
 Filter: (("Message-ID")::text = (m1."In-Reply-To")::text)


If I understand it correctly, after returning a ForeignScan (from
GetForeignPlan), the planner decides to use a nestloop, and in the
process of creating the nestloop plan, replaces Var nodes coming from
the outerel (here, m1."In-Reply-To") by params nodes.

My current implementation already looks for (var = param) expressions
that it may handle during the plan phase and stores the association
between the var and the param_id.
At execution time, the needed parameters values are fetched (from the
ParamExecData array found in es_param_exec_vals) and passed to the
python foreign data wrapper.
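A sketch of that execution-time lookup (9.2-era executor fields; the
paramid here is assumed to come from the var-to-param mapping saved at
plan time):

#include "postgres.h"
#include "nodes/execnodes.h"

/*
 * Fetch the current value of a generated exec Param so it can be
 * passed down to the wrapper as a lookup constant.
 */
static Datum
fetch_exec_param(EState *estate, int paramid, bool *isnull)
{
	ParamExecData *prm = &(estate->es_param_exec_vals[paramid]);

	*isnull = prm->isnull;
	return prm->value;
}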

The problem I have: how can I retrieve the generated params and keep
the association between the var and the param?

Should I replace the (var = outervar) clauses by (var = param) myself
and store them in the fdw_exprs field of the foreign scan?

> FWIW, I'm not sure that it's sane to try to expose this stuff to
> python yet.  It's complicated and still something of a moving
> target.  Once we've got a few more C-coded FDWs that can do this
> type of optimization, things might be stable enough that it'd be
> useful to try to provide a high-level API.

The current API (as mentioned above) would be more declarative than
anything, only offering a way to (maybe) build parameterized paths
without guaranteeing anything. Even if the internals change, I fail to
see how it can hurt to offer such a feature.


Regards,

-- 
Ronan Dunklau



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Dimitri Fontaine
Hi,

Jose Ildefonso Camargo Tolosa  writes:
> environments.  And no, it doesn't makes synchronous replication
> meaningless, because it will work synchronous if it have someone to
> sync to, and work async (or standalone) if it doesn't: that's perfect
> for HA environment.

You seem to want Service Availability when we are providing Data
Availability. I'm not saying you shouldn't ask what you're asking, just
that it is a different need.

If you troll the archives, you will see that this debate has received
much consideration already. The conclusion is that if you care about
Service Availability you should have two standby servers and set them both
as candidates to being the synchronous one.

That way, when you lose one standby the service is unaffected, the
second standby is now the synchronous one, and it's possible to
re-attach the failed standby live, with or without archiving (with is
preferred so that the master isn't involved in the catch-up phase).
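In configuration terms the recommendation boils down to this on the
master (standby names are placeholders):

# postgresql.conf on the master
synchronous_commit = on
# Both standbys are candidates, in priority order: the first connected
# one is synchronous, and the other takes over if it disconnects.
synchronous_standby_names = 'standby_a, standby_b'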

> As synchronous standby currently is, it just doesn't fit the HA usage,

It does actually allow both data high availability and service high
availability, provided that you feed at least two standbys.

What you seem to be asking is both data and service high availability
with only two nodes. You're right that we can not provide that with
current releases of PostgreSQL. I'm not sure anyone has a solid plan to
make that happen.

> and if you really want to keep it that way, it doesn't belong to the
> HA chapter on the pgsql documentation, and should be moved.  And NO
> async replication will *not* work for HA, because the master can have
> more transactions than standby, and if the master crashes, the standby
> will have no way to recover these transactions, with synchronous
> replication we have *exactly* what we need: the data in the standby,
> after all, it will apply it once we promote it.

Exactly. We want data availability first. Service availability is
important too, and for that you need another standby.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Kohei KaiGai
2012/7/12 Shigeru HANADA :
> (2012/07/12 6:04), Peter Eisentraut wrote:
>> On tor, 2012-06-14 at 21:29 +0900, Shigeru HANADA wrote:
>>> I'd like to propose pgsql_fdw, FDW for PostgreSQL, as a contrib module
>>> in core, again.
>>
>> Do you have any new proposals regarding naming, and how to deal with
>> postgresql_fdw_validator, and dblink?
>
> Yes, I've proposed to rename existing postgresql_fdw_validator to
> dblink_fdw_validator and move it into contrib/dblink so that pgsql_fdw
> can use the names "postgresql_fdw" and "postgresql_fdw_validator".
> Though this post has had no response...
>
It seems to me that what postgresql_fdw_validator() is doing makes it
look like a function that should be named "libpq_fdw_validator()".

What is your opinion? It would help with this namespace conflict.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Magnus Hagander
On Tue, Jul 10, 2012 at 7:03 PM, Fujii Masao  wrote:
> On Tue, Jul 10, 2012 at 3:23 AM, Fujii Masao  wrote:
>> Hi,
>>
>> I found several problems in pg_receivexlog, e.g., memory leaks,
>> file-descriptor leaks, etc. The attached patch fixes these problems.
>>
>> ISTM there are still some other problems in pg_receivexlog, so I'll
>> read it deeply later.
>
> While pg_basebackup background process is streaming WAL records,
> if its replication connection is terminated (e.g., walsender in the server
> is accidentally terminated by SIGTERM signal), pg_basebackup ends
> up failing to include all required WAL files in the backup. The problem
> is that, in this case, pg_basebackup doesn't emit any error message at all.
> So a user might misunderstand that a base backup has been successfully
> taken even though it doesn't include all required WAL files.

Ouch. That is definitely a bug if it behaves that way.


> To fix this problem, I think that, when the replication connection is
> terminated, ReceiveXlogStream() should check whether we've already
> reached the stop point by calling stream_stop() before returning TRUE.
> If we've not yet (this means that we've not received all required WAL
> files yet), ReceiveXlogStream() should return FALSE and
> pg_basebackup should emit an error message.  Comments?

Doesn't it already return false because it detects the error of the
connection? What's the codepath where we end up returning true even
though we had a connection failure? Shouldn't that end up under the
"could not read copy data" branch, which already returns false?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Magnus Hagander
On Tue, Jul 10, 2012 at 6:45 PM, Fujii Masao  wrote:
> On Tue, Jul 10, 2012 at 6:27 AM, Magnus Hagander  wrote:
>> On Mon, Jul 9, 2012 at 8:23 PM, Fujii Masao  wrote:
>>> Hi,
>>>
>>> I found several problems in pg_receivexlog, e.g., memory leaks,
>>> file-descriptor leaks, etc. The attached patch fixes these problems.
>>
>> While I don't doubt that what you've found are real problems, would
>> you mind explaining exactly what they are, so we don't have to
>> reverse-engineer the patch to figure that out?
>
> Yep.
>
> When an error happens after the replication connection has been established,
> pg_receivexlog doesn't close an open file descriptor and release an allocated
> memory area. This was harmless before 16282ae688de2b320cf176e9be8a89e4dfc60698
> because pg_receivexlog exits immediately when an error happens. But
> currently in an error case, pg_receivexlog tries reconnecting to the server
> infinitely, so file descriptors and memory would leak. I think this is a problem
> and should be fixed. The patch which I submitted yesterday changes
> pg_receivexlog so that it closes the open file and frees the memory area
> before reconnecting to the server.
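In outline, the fix is a cleanup step before each reconnection attempt,
something like (variable names illustrative):

/* Release per-attempt resources before retrying the connection. */
if (walfile != -1)
{
	close(walfile);
	walfile = -1;
}
if (conn != NULL)
{
	PQfinish(conn);
	conn = NULL;
}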

Thanks. I get it now, and this explains why I didn't see it before - I
didn't check properly after we added the loop mode. Patch applied with
minor changes (e.g. there's no point in doing PQfinish(tmpconn) right
after you've verified tmpconn is NULL).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Shigeru HANADA
(2012/07/12 6:04), Peter Eisentraut wrote:
> On tor, 2012-06-14 at 21:29 +0900, Shigeru HANADA wrote:
>> I'd like to propose pgsql_fdw, FDW for PostgreSQL, as a contrib module
>> in core, again.
> 
> Do you have any new proposals regarding naming, and how to deal with
> postgresql_fdw_validator, and dblink?

Yes, I've proposed to rename existing postgresql_fdw_validator to
dblink_fdw_validator and move it into contrib/dblink so that pgsql_fdw
can use the names "postgresql_fdw" and "postgresql_fdw_validator".
Though this post has had no response...

http://archives.postgresql.org/message-id/4f854f43.4030...@gmail.com

Regards,
-- 
Shigeru HANADA



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Etsuro Fujita
> 2012/6/26 Kohei KaiGai :
> > Harada-san,
> >
> > I checked your patch, and had an impression that includes many
> > improvements from the previous revision that I looked at the last
> > commit fest.
> >
> > However, I noticed several points to be revised, or investigated.
> >
> > * It seems to me expected results of the regression test is not
> >   attached, even though test cases were included. Please add it.

KaiGai-san, did you find the file?  That is in the contrib/pgsql_fdw/expected
directory, named pgsql_fdw.out.  If necessary, I will send the file to you.
BTW, I found some bugs in the expected results of the file.  Attached is a patch
fixing the bugs.

Thanks,

Best regards,
Etsuro Fujita



pgsql_fdw_regress.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scheduled back-branch releases?

2012-07-12 Thread Magnus Hagander
On Thu, Jul 12, 2012 at 11:41 AM, Bernd Helmle  wrote:
> Do we have a schedule for when the next back-branch releases are packaged
> (I hope I didn't miss any announcement...)?

No, there is no such schedule (yet).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Scheduled back-branch releases?

2012-07-12 Thread Bernd Helmle
Do we have a schedule for when the next back-branch releases are packaged
(I hope I didn't miss any announcement...)?

-- 
Thanks

Bernd

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.

2012-07-12 Thread Heikki Linnakangas

On 07.07.2012 00:12, Jan Urbański wrote:

On 06/07/12 22:47, Peter Eisentraut wrote:

On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote:

What shall we do about those? Ignore them? Document that if you're using
one of these encodings then PL/Python with Python 2 will be crippled and
with Python 3 just won't work?


We could convert to UTF-8, and use the PostgreSQL functions to convert
from UTF-8 to the server encoding. Double conversion might be slow, but
I think it would be better than failing.


Actually, we already do the other direction that way
(PLyUnicode_FromStringAndSize), so maybe it would be more consistent to
always use this.

I would hesitate to use this as a kind of fallback, because then we
would sometimes be using PostgreSQL's recoding tables and sometimes
Python's recoding tables, which could become confusing.


So you're in favour of doing unicode -> bytes by encoding with UTF-8 and
then using the server's encoding functions?


Sounds reasonable to me. The extra conversion between UTF-8 and UCS-2 
should be quite fast, and it would be good to be consistent in the way 
we do conversions in both directions.
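A sketch of that direction in C (the function name is illustrative; under
Python 2 the bytes accessor is PyString_AsString, and error paths are
trimmed):

#include <Python.h>
#include "postgres.h"
#include "mb/pg_wchar.h"

/*
 * Convert a Python unicode object to a palloc'd string in the server
 * encoding: Python's codec does unicode -> UTF-8, and PostgreSQL's
 * conversion routines do UTF-8 -> server encoding.
 */
static char *
PLyUnicode_ToServer(PyObject *unicode)
{
	PyObject   *bytes = PyUnicode_AsUTF8String(unicode);
	char	   *utf8string = PyBytes_AsString(bytes);
	char	   *converted;

	converted = (char *) pg_do_encoding_conversion((unsigned char *) utf8string,
												   strlen(utf8string),
												   PG_UTF8,
												   GetDatabaseEncoding());
	if (converted == utf8string)
		converted = pstrdup(converted);	/* copy out of the bytes object */

	Py_DECREF(bytes);
	return converted;
}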


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for XLogRecPtr in expand_fmt_string?

2012-07-12 Thread Heikki Linnakangas

On 07.07.2012 01:03, Peter Eisentraut wrote:

On tis, 2012-07-03 at 14:52 -0400, Tom Lane wrote:

Peter Eisentraut  writes:

On tis, 2012-07-03 at 19:35 +0200, Andres Freund wrote:

I wonder if we just should add a format code like %R or something similar as a
replacement for the %X/%X notion.



Maybe just print it as a single 64-bit value from now on.


That'd be problematic also, because of the lack of standardization of
the format code for uint64.  We could write things like
"message... " UINT64_FORMAT " ...more message"
but I wonder how well the translation tools would work with that;
and anyway it would at least double the translation effort for
messages containing such things.


The existing uses of INT64_FORMAT and UINT64_FORMAT show how this is
done:  You print the value in a temporary buffer and use %s in the final
string.  It's not terribly pretty, but it's been done this way forever,
including in xlog code, so there shouldn't be a reason to hesitate about
the use for this particular case.
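Concretely, the buffer dance being described looks like this (for a
uint64 XLogRecPtr named recptr):

char		buf[32];

snprintf(buf, sizeof(buf), UINT64_FORMAT, (uint64) recptr);
elog(LOG, "current WAL location is %s", buf);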


That's hardly any simpler than what we have now.

On 03.07.2012 21:09, Tom Lane wrote:
> Andres Freund  writes:
>> I wonder if we just should add a format code like %R or something
>> similar as a replacement for the %X/%X notion.
>
> Only if you can explain how to teach gcc what it means for elog argument
> match checking.  %m is a special case in that it matches up with a
> longstanding glibc-ism that gcc knows about.  Adding format codes of our
> own invention would be problematic.

One idea would be to use a macro, like this:

#define XLOGRECPTR_FMT_ARGS(recptr) \
	(uint32) ((recptr) >> 32), (uint32) (recptr)


elog(LOG, "current WAL location is %X/%X", XLOGRECPTR_FMT_ARGS(RecPtr));

One downside is that at first glance, that elog() looks broken, because 
the number of arguments doesn't appear to match the format string.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Logging both start and end of temporary file usage

2012-07-12 Thread Christophe Pettus
In working with a client to analyze their temp file usage, it became useful 
to know when a temporary file was created as well as when it was closed.  That 
way, we could process the logs to determine a high water mark of overall temp 
file usage, to know how high it was safe (in that workload) to set work_mem.  
So, I wrote a quick patch that logged both the open and close of the temp file.

Since the final size of the file isn't known at the time that the file is 
created, the patch just logs the filename.  A particular file can be 
correlated with its size by name when the close message is logged.  Of 
course, there's no information about the pattern of the file size over time, 
but it's a bit more information than was there before.

As we don't know the size of the file until close time, the open is only logged 
if log_temp_files is 0 (the idea being that's "maximum logging").
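As a sketch, the open-time message could be as small as this, next to
where fd.c produces the close-time message (placement and variable names
illustrative):

if (log_temp_files == 0)
	ereport(LOG,
			(errmsg("temporary file created: path \"%s\"",
					tempfilepath)));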

If this sounds like something worthwhile in general, I can package it up as a 
proper patch.
--
-- Christophe Pettus
   x...@thebuild.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers