### Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

```On Tue, Aug 19, 2014 at 2:08 AM, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-18 13:06:15 -0400, Robert Haas wrote:
On Mon, Aug 18, 2014 at 7:19 AM, Rahila Syed rahilasye...@gmail.com wrote:
According to the measurement result, the amount of WAL generated in
Multiple Blocks in one run is less than that in Single Block in one run.
So ISTM that compression of multiple blocks at one run can improve
the compression ratio. Am I missing something?

Sorry for using unclear terminology. "WAL generated" here means the WAL that
gets generated in each run without compression.
So, the value "WAL generated" in the above measurement is uncompressed WAL,
to be specific:
uncompressed WAL = compressed WAL + bytes saved.

Here, the measurements are done for a constant amount of time rather than
a fixed number of transactions. Hence the amount of WAL generated does not
correspond to the compression ratios of each algo, so I have calculated the
bytes saved in order to get an accurate idea of the amount of compression in
each scenario and for the various algorithms.

Compression ratio i.e Uncompressed WAL/compressed WAL in each of the above
scenarios are as follows:

Compression algo    Multiple Blocks in one run    Single Block in one run
LZ4                 1.21                          1.27
Snappy              1.19                          1.25
pglz                1.14                          1.16

This shows that the compression ratios of the two scenarios, Multiple Blocks
and Single Block, are nearly the same for this benchmark.

I don't agree with that conclusion.  The difference between 1.21 and
1.27, or between 1.19 and 1.25, is quite significant.  Even the
difference between 1.14 and 1.16 is not trivial.  We should try to get
the larger benefit, if it is possible to do so without an unreasonable
effort.

Agreed.

One more question: Do I see it right that multiple blocks compressed
together compress *worse* than compressing individual blocks? If so, I
have a rather hard time believing that the patch is sane.

Or the way of benchmark might have some problems.

Rahila,
I'd like to measure the compression ratio in both multiple blocks and
single block cases.
Could you tell me where the patch for single block in one run is?

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```
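The ratio bookkeeping described in the thread above (uncompressed WAL = compressed WAL + bytes saved) can be sketched as follows; the figures below are hypothetical, not measurements from the thread:

```python
def compression_ratio(compressed_bytes, bytes_saved):
    # uncompressed WAL = compressed WAL + bytes saved, per the definition above
    uncompressed = compressed_bytes + bytes_saved
    return uncompressed / compressed_bytes

# Hypothetical figures: 100 units of compressed WAL with 21 units saved
print(round(compression_ratio(100, 21), 2))  # 1.21
```

Computing the ratio from bytes saved rather than raw WAL volume is what makes constant-time runs comparable, as Rahila explains above.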

### Re: [HACKERS] 9.5: Memory-bounded HashAgg

```On Fri, 2014-08-15 at 13:53 -0400, Robert Haas wrote:
I think that's right, and I rather like your (Jeff's) approach.  It's
definitely true that we could do better if we have a mechanism for
serializing and deserializing group states, but (1) I think an awful
lot of cases would get an awful lot better even just with the approach
proposed here and (2) I doubt we would make the
serialization/deserialization interfaces mandatory, so even if we had
that we'd probably want a fallback strategy anyway.

Thank you for taking a look.

To solve the problem for array_agg, that would open up two potentially
lengthy discussions:

1. Trying to support non-serialized representations (like
ArrayBuildState for array_agg) as a real type rather than using
internal.

2. What changes should we make to the aggregate API? As long as we're
changing/extending it, should we go the whole way and support partial
aggregation[1] (particularly useful for parallelism)?

Both of those discussions are worth having, and perhaps they can happen
in parallel as I wrap up this patch.

I'll see whether I can get consensus that my approach is (potentially)
commit-worthy, and your statement that it (potentially) solves a real
problem is a big help.

Regards,
Jeff Davis

[1]
http://blogs.msdn.com/b/craigfr/archive/2008/01/18/partial-aggregation.aspx


```
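For reference, the partial-aggregation idea cited in [1] can be sketched like this (illustrative Python, not the PostgreSQL aggregate API; `trans`, `combine`, and `final` are hypothetical names):

```python
def trans(state, value):
    # per-row transition: fold one input value into a partial state
    total, count = state
    return (total + value, count + 1)

def combine(s1, s2):
    # merge two partial states (the piece parallelism needs)
    return (s1[0] + s2[0], s1[1] + s2[1])

def final(state):
    # finalize: here, an average
    total, count = state
    return total / count

# Each chunk could be processed by a separate worker
chunks = [[1, 2, 3], [4, 5], [6]]
partials = []
for chunk in chunks:
    st = (0, 0)
    for v in chunk:
        st = trans(st, v)
    partials.append(st)

merged = (0, 0)
for p in partials:
    merged = combine(merged, p)

print(final(merged))  # 3.5
```

The extra `combine` step is what a partial-aggregation API would add over the existing transition/final functions.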

### Re: [HACKERS] WAL format and API changes (9.5)

```On Mon, Aug 18, 2014 at 10:55 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
All rightey.

Here's the next version of this work. It now comes as two patches. The first
one adds the XLogReplayBuffer() function and refactors all the
redo functions to use it. It doesn't change the WAL record format in any
way. The second patch applies over the first one, and changes the WAL
format, and all the WAL-creating functions to use the new API for
constructing WAL records. The second patch removes the relfilenode and
block
number arguments from XLogReplayBuffer, because they're no longer needed
when that information is included in the record format.

Todo:

to
WAL insertion?
* Compare WAL record sizes before and after. I've tried to keep it as
compact as possible, but I know that some records have gotten bigger. Need
to do a more thorough analysis.
* Rename XLogReplayBuffer. I don't particularly like any of the
suggestions
so far, XLogReplayBuffer included. Discussion continues..
* Anything else?

Patch 1 looks good. The main difference with the v1 submitted a couple of
days back is that the global variable approach is replaced with additional
arguments for the LSN position and record pointer in XLogReplayBuffer. I
have as well run a couple of tests with the page comparison tool, done some
tests based on installcheck-world with a slave replaying WAL behind, and
found no issues with it.
Perhaps we could consider pushing it to facilitate the next work? Even if
the second patch is dropped, it is still a win IMO to have backup block
replay managed within a single function (be it named XLogReplayBuffer as in
the latest patch), and having it return a status flag.

Regarding patch 2:
- The main differences with the latest version are the modifications for
XLogReplayBuffer having new arguments (LSN position and record pointer).
XLogRecGetBlockRefIds has been changed to return a palloc'd array of block
IDs. xloginsert.h, containing all the functions for xlog record
construction is introduced as well.
- Tiny thing, be aware of tab padding. Here is heapam.c:
page = BufferGetPage(buffer);
PageSetAllVisible(page);
MarkBufferDirty(buffer);
- XLogRecGetBlockRefIds is not described in
src/backend/access/transam/README. Btw, pg_xlogdump drops a core dump when
using it:
--Output:
Assertion failed: (n == *num_refs), function XLogRecGetBlockRefIds, file
rmgr: Heap        len (rec/tot): 14/ 12912, tx: 3, lsn:
0/01000148, prev 0/01000120, Abort trap: 6 (core dumped)
-- Backtrace:
frame #4: 0x000103870363
pg_xlogdump`XLogRecGetBlockRefIds(record=0x7ff38a003200,
frame #5: 0x00010386d610
pg_xlogdump`XLogDumpDisplayRecord(config=0x7fff5c3945c8,
ReadRecPtr=16777544, record=0x7ff38a003200) + 272 at pg_xlogdump.c:357
argv=0x7fff5c394658) + 3160 at pg_xlogdump.c:749
In order to reproduce that, simply run regression tests, followed by
pg_xlogdump on one of the WAL files generated.
- This patch includes some diffs from pg_receivexlog.c taken from 52bffe3.
- I have run installcheck-world and compared the size of the WAL generated
on HEAD and the patch (any hints to improve such analysis are of course
welcome)
      name      |   start   |    stop    |   diff
----------------+-----------+------------+-----------
 HEAD (8605bc7) | 0/16C6808 | 0/11A2C670 | 271998568
 Patch 1+2      | 0/16D45D8 | 0/1267A4B0 | 284843736
(2 rows)
So that's a diff of more or less 13MB for this test set.

Looking forward to some performance numbers as well as a more precise
comparison of WAL record lengths.
--
Michael

```
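The diff column in the table above can be reproduced from the LSNs, since an LSN of the form `X/Y` denotes the 64-bit byte position `(X << 32) | Y`:

```python
def lsn_to_int(lsn):
    # Parse a PostgreSQL LSN 'X/Y' into a 64-bit byte position
    hi, lo = lsn.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

def wal_bytes(start, stop):
    return lsn_to_int(stop) - lsn_to_int(start)

print(wal_bytes('0/16C6808', '0/11A2C670'))  # 271998568 (HEAD)
print(wal_bytes('0/16D45D8', '0/1267A4B0'))  # 284843736 (patch 1+2)
print(284843736 - 271998568)                 # 12845168, i.e. roughly 12-13 MB
```

This matches both diffs reported in the mail, so the "more or less 13MB" figure is just the difference of the two byte positions.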

### Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

```On Thu, 2014-08-14 at 12:52 -0400, Robert Haas wrote:
It appears to me that the performance characteristics for this version
are not significantly different from version 1.  I have not looked at
the code.

While trying to reproduce your results, I noticed what might be around a
1% regression just from adding the 3 fields to MemoryContextData. If I
cut it down to adding just one field, the regression disappears.

The results are fairly noisy, so I could be chasing the wrong thing. But
one reason to believe it is that I pushed the size of MemoryContextData
above 64, which sounds like it might be an important threshold.

Regards,
Jeff Davis


```

### Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

```On Sat, 2014-08-16 at 23:09 +0200, Tomas Vondra wrote:
But maybe the inheritance really is not necessary - maybe it would be
enough to track this per-context, and then just walk through the
contexts and collect this. Because my observation is that most of the
time is actually spent in walking through blocks and freelists.

That makes a lot of sense to me.

Another approach is to pass a flag to hash_create that tells it not to
create a subcontext. Then we don't need to traverse anything; we already
know which context we want to look at. Perhaps I was being too clever
with the idea of tracking space for an entire hierarchy.

Also, as I pointed out in my reply to Robert, adding too many fields to
MemoryContextData may be the cause of the regression. Your idea requires
only one field, which doesn't show the same regression in my tests.

Regards,
Jeff Davis


```
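The per-context idea above, with totals collected by walking the context tree on demand instead of propagating them up on every allocation, can be sketched as follows (illustrative structures, not the real MemoryContextData):

```python
class Context:
    def __init__(self, parent=None):
        self.self_allocated = 0   # bytes allocated in this context only
        self.children = []
        if parent is not None:
            parent.children.append(self)

def total_allocated(ctx):
    # Collected on demand by walking the tree, rather than updating
    # every ancestor's counter on each allocation
    return ctx.self_allocated + sum(total_allocated(c) for c in ctx.children)

root = Context()
child = Context(root)
root.self_allocated = 100
child.self_allocated = 40
print(total_allocated(root))  # 140
```

The trade-off discussed in the thread is visible here: per-allocation cost is one counter update, but reading the total is O(number of contexts) in the subtree.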

### Re: [HACKERS] pg_receivexlog and replication slots

```On Mon, Aug 18, 2014 at 4:01 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
On Mon, Aug 18, 2014 at 3:48 PM, Fujii Masao masao.fu...@gmail.com wrote:
On Mon, Aug 18, 2014 at 2:38 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
- IDENTIFY_SYSTEM checks were incorrect (even in HEAD). The following
check was done but in 9.4 this command returns 4 fields:
(PQntuples(res) != 1 || PQnfields(res) < 3)
That's not directly related to this patch, but making some corrections
is not going to hurt..

Good catch! I found that libpqwalreceiver.c, etc have the same problem.
It's better to fix this separately. Patch attached.
Patch looks good to me.

Okay, applied!

Once you push that I'll rebase the stuff on
this thread once again, that's going to conflict for sure. And now
IDENTIFY_SYSTEM and pg_basebackup as well...

Yep, that's possible. But since the patch needs to be back-patch to 9.4,
I didn't do the refactoring.

Regards,

--
Fujii Masao


```

### Re: [HACKERS] WAL format and API changes (9.5)

```
On 08/19/2014 11:07 AM, Michael Paquier wrote:

Patch 1 looks good. The main difference with the v1 submitted a couple of
days back is that the global variable approach is replaced with additional
arguments for the LSN position and record pointer in XLogReplayBuffer. I
have as well run a couple of tests with the page comparison tool, done some
tests based on installcheck-world with a slave replaying WAL behind, and
found no issues with it.

Thanks!

Perhaps we could consider pushing it to facilitate the next work? Even if
the second patch is dropped it is still a win IMO to have backup block
replay managed within a single function (being it named XLogReplayBuffer in
latest patch), and having it return a status flag.

Yeah, that was my plan.

Regarding the name, the following have been suggested so far:

XLogReplayBuffer
XLogRestoreBuffer
XLogRecoverBuffer

One more idea:

XLogRedoBuffer (this would be like the three options above, but would match
the fact that we call the functions that call this the redo functions)

I think XLogReadBufferForReplay is the most descriptive. Andres and
Alvaro both suggested it - independently I believe - so that seems to
come out naturally. But maybe make it XLogReadBufferForRedo, since we
call the redo functions redo functions and not replay functions.

Yet another option is to just call it XLogReadBuffer, and rename the
existing XLogReadBuffer to something else. With the 2nd patch, there are
only a few callers of XLogReadBuffer left. But is it too deceitful if
it sometimes replaces the page with a full-page image? Maybe it's OK..

Barring objections or better ideas, I'm leaning towards

- Heikki


```
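Whatever name wins, the contract of the function being discussed above can be sketched like this (a loose illustration; the status names and shapes are hypothetical, not the patch's actual API):

```python
BLK_NEEDS_REDO, BLK_RESTORED = 0, 1  # hypothetical status values

def read_buffer_for_redo(record):
    """Centralize the backup-block decision: restore the full-page
    image when the record carries one, otherwise tell the caller to
    apply the record's changes itself."""
    return BLK_RESTORED if record["has_fpi"] else BLK_NEEDS_REDO

def heap_redo(record):
    # A redo function branches on the returned status flag
    if read_buffer_for_redo(record) == BLK_NEEDS_REDO:
        return "applied record changes"
    return "restored full-page image"

print(heap_redo({"has_fpi": False}))  # applied record changes
print(heap_redo({"has_fpi": True}))   # restored full-page image
```

This is the "win" Michael describes in the sibling thread: every redo function stops open-coding the full-page-image check.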

### Re: [HACKERS] pg_receivexlog and replication slots

```On 2014-08-18 14:38:06 +0900, Michael Paquier wrote:
- IDENTIFY_SYSTEM checks were incorrect (even in HEAD). The following
check was done but in 9.4 this command returns 4 fields:
(PQntuples(res) != 1 || PQnfields(res) < 3)

Which is correct. We don't want to error out in the case where 3 columns
are returned because that'd unnecessarily break compatibility with
9.4. Previously that check was != 3...

This isn't a bug.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```
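Andres's compatibility argument, in code form (a sketch of the check's logic, not the actual client source):

```python
def identify_system_ok(ntuples, nfields, min_fields=3):
    # Reject only when there are FEWER columns than the oldest
    # supported server would send; extra columns from a newer
    # server (9.4 adds a 4th) are fine.
    return ntuples == 1 and nfields >= min_fields

print(identify_system_ok(1, 3))  # True  (3-column IDENTIFY_SYSTEM result)
print(identify_system_ok(1, 4))  # True  (4-column, 9.4-style result)
print(identify_system_ok(1, 2))  # False (malformed response)
```

An exact `!= expected` comparison would be the alternative Michael proposed; the `<` form is what keeps one client binary working against both column counts.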

### Re: [HACKERS] pg_receivexlog and replication slots

```On Tue, Aug 19, 2014 at 5:52 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-18 14:38:06 +0900, Michael Paquier wrote:
- IDENTIFY_SYSTEM checks were incorrect (even in HEAD). The following
check was done but in 9.4 this command returns 4 fields:
(PQntuples(res) != 1 || PQnfields(res) < 3)

Which is correct. We don't want to error out in the case where 3 columns
are returned because that'd unnecessarily break compatibility with
9.4. Previously that check was != 3...

This isn't a bug.

Okay, I understood why you didn't update those codes.

Since we don't allow replication between different major versions,
it's better to apply this change at least into libpqwalreceiver.c. Thought?

Regards,

--
Fujii Masao


```

### Re: [HACKERS] pg_receivexlog and replication slots

```On 2014-08-19 18:02:32 +0900, Fujii Masao wrote:
On Tue, Aug 19, 2014 at 5:52 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-18 14:38:06 +0900, Michael Paquier wrote:
- IDENTIFY_SYSTEM checks were incorrect (even in HEAD). The following
check was done but in 9.4 this command returns 4 fields:
(PQntuples(res) != 1 || PQnfields(res) < 3)

Which is correct. We don't want to error out in the case where 3 columns
are returned because that'd unnecessarily break compatibility with
9.4. Previously that check was != 3...

This isn't a bug.

Okay, I understood why you didn't update those codes.

Since we don't allow replication between different major versions,
it's better to apply this change at least into libpqwalreceiver.c. Thought?

We'd discussed that we'd rather keep it consistent. It also results in a
more explanatory error message later on.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] pg_receivexlog and replication slots

```On Tue, Aug 19, 2014 at 6:03 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-19 18:02:32 +0900, Fujii Masao wrote:
On Tue, Aug 19, 2014 at 5:52 PM, Andres Freund and...@2ndquadrant.com
wrote:
On 2014-08-18 14:38:06 +0900, Michael Paquier wrote:
- IDENTIFY_SYSTEM checks were incorrect (even in HEAD). The following
check was done but in 9.4 this command returns 4 fields:
(PQntuples(res) != 1 || PQnfields(res) < 3)

Which is correct. We don't want to error out in the case where 3 columns
are returned because that'd unnecessarily break compatibility with
9.4. Previously that check was != 3...

This isn't a bug.

Okay, I understood why you didn't update those codes.

Since we don't allow replication between different major versions,
it's better to apply this change at least into libpqwalreceiver.c. Thought?

We'd discussed that we'd rather keep it consistent. It also results in a
more explanatory error message later on.

Hmm... okay, will revert the commit.

Regards,

--
Fujii Masao


```

### Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

```Hello,

Could you tell me where the patch for single block in one run is?
Please find attached patch for single block compression in one run.

Thank you,

On Tue, Aug 19, 2014 at 1:17 PM, Fujii Masao masao.fu...@gmail.com wrote:

On Tue, Aug 19, 2014 at 2:08 AM, Andres Freund and...@2ndquadrant.com
wrote:
On 2014-08-18 13:06:15 -0400, Robert Haas wrote:
On Mon, Aug 18, 2014 at 7:19 AM, Rahila Syed rahilasye...@gmail.com
wrote:
According to the measurement result, the amount of WAL generated in
Multiple Blocks in one run is less than that in Single Block in one run.
So ISTM that compression of multiple blocks at one run can improve
the compression ratio. Am I missing something?

Sorry for using unclear terminology. WAL generated here means WAL
that gets
generated in each run without compression.
So, the value WAL generated in the  above measurement is uncompressed
WAL
generated to be specific.
uncompressed WAL = compressed WAL  + Bytes saved.

Here, the measurements are done for a constant amount of time rather
than
fixed number of transactions. Hence amount of WAL generated does not
correspond to compression ratios of each algo. Hence have calculated
bytes
saved in order to get accurate idea of the amount of compression in
each
scenario and for various algorithms.

Compression ratio i.e Uncompressed WAL/compressed WAL in each of the
above
scenarios are as follows:

Compression algo    Multiple Blocks in one run    Single Block in one run
LZ4                 1.21                          1.27
Snappy              1.19                          1.25
pglz                1.14                          1.16

This shows that the compression ratios of the two scenarios, Multiple Blocks
and Single Block, are nearly the same for this benchmark.

I don't agree with that conclusion.  The difference between 1.21 and
1.27, or between 1.19 and 1.25, is quite significant.  Even the
difference between 1.14 and 1.16 is not trivial.  We should try to get
the larger benefit, if it is possible to do so without an unreasonable
effort.

Agreed.

One more question: Do I see it right that multiple blocks compressed
together compress *worse* than compressing individual blocks? If so, I
have a rather hard time believing that the patch is sane.

Or the way of benchmark might have some problems.

Rahila,
I'd like to measure the compression ratio in both multiple blocks and
single block cases.
Could you tell me where the patch for single block in one run is?

Regards,

--
Fujii Masao

CompressSingleBlock.patch
Description: Binary data


```
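One cheap way to sanity-check the "multiple blocks compress worse?" question raised above is to compare the two strategies directly. This sketch uses zlib on synthetic blocks, so LZ4/snappy/pglz on real WAL pages may well behave differently:

```python
import zlib

block_a = b"A" * 8192
block_b = b"B" * 8192

# Compress each block individually vs. both blocks as one input
separate = len(zlib.compress(block_a)) + len(zlib.compress(block_b))
together = len(zlib.compress(block_a + block_b))

# With one shared stream (one header, one dictionary window), the
# combined input normally compresses at least as well as the two
# blocks compressed independently
print(together <= separate)  # True
```

If a real benchmark shows the opposite, as the measurements in the thread seem to, that is indeed surprising and worth investigating, which is Andres's point.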

### Re: [HACKERS] PoC: Partial sort

```On Tue, Feb 11, 2014 at 7:59 AM, Alexander Korotkov aekorot...@gmail.com
wrote:

Done. Patch is split.

I've started to look at this, and for now I'm still finding my way around
the patch, so I'm not quite there yet with understanding everything.
Nevertheless, it seems best to post my comments early, so as to help
maintain concurrency between the review and getting the patch into shape.

I've only been looking at partial-sort-basic-1.patch so far;

The patch no longer applies to master, but this was only due to a tab being
replaced by 2 spaces in a pgident run. I've attached an updated patch which
currently applies without any issues.

Here's a few notes from reading over the code:

* pathkeys.c

EquivalenceMember *member = (EquivalenceMember *)

occurs in costsize.c

* pathkeys.c

The following fragment:

n = pathkeys_common(root->query_pathkeys, pathkeys);

if (n != 0)
{
    /* It's useful ... or at least the first N keys are */
    return n;
}

return 0; /* path ordering not useful */
}

could be written more simply as:

/* return the number of path keys in common, or 0 if there are none */
return pathkeys_common(root->query_pathkeys, pathkeys);

* execnodes.h

In struct SortState, some new fields don't have a comment.

I've also thrown a few different workloads at the patch and I'm very
impressed with most of the results. Especially when LIMIT is used, however
I've found a regression case which I thought I should highlight, but for
now I can't quite see what could be done to fix it.

create table a (x int not null, y int not null);
insert into a select x.x,y.y from generate_series(1,100) x(x) cross
join generate_series(1,10) y(y);

Patched:
explain analyze select x,y from a where x+0=1 order by x,y limit 10;
QUERY PLAN

Limit  (cost=92.42..163.21 rows=10 width=8) (actual
time=6239.426..6239.429 rows=10 loops=1)
->  Partial sort  (cost=92.42..354064.37 rows=5 width=8) (actual
time=6239.406..6239.407 rows=10 loops=1)
Sort Key: x, y
Presorted Key: x
Sort Method: quicksort  Memory: 25kB
->  Index Scan using a_x_idx on a  (cost=0.44..353939.13
rows=5 width=8) (actual time=0.059..6239.319 rows=10 loops=1)
Filter: ((x + 0) = 1)
Rows Removed by Filter: 990
Planning time: 0.212 ms
Execution time: 6239.505 ms
(10 rows)

Time: 6241.220 ms

Unpatched:
explain analyze select x,y from a where x+0=1 order by x,y limit 10;
QUERY PLAN

Limit  (cost=195328.26..195328.28 rows=10 width=8) (actual
time=3077.759..3077.761 rows=10 loops=1)
->  Sort  (cost=195328.26..195453.26 rows=5 width=8) (actual
time=3077.757..3077.757 rows=10 loops=1)
Sort Key: x, y
Sort Method: quicksort  Memory: 25kB
->  Seq Scan on a  (cost=0.00..194247.77 rows=5 width=8)
(actual time=0.018..3077.705 rows=10 loops=1)
Filter: ((x + 0) = 1)
Rows Removed by Filter: 990
Planning time: 0.510 ms
Execution time: 3077.837 ms
(9 rows)

Time: 3080.201 ms

As you can see, the patched version performs an index scan in order to get
the partially sorted results, but it does end up quite a bit slower than
the seqscan/sort that the unpatched master performs. I'm not quite sure how
realistic the x+0 = 1 WHERE clause is, but perhaps the same would happen if
something like x+y = 1 was performed too. After a bit more analysis on
this, I see that if I change the 50k estimate to 10 in the debugger then
the num_groups is properly estimated at 1 and it then performs the seq scan
instead. So it looks like the costings of the patch are not to blame here.
(The 50k row estimate comes from rel tuples / DEFAULT_NUM_DISTINCT.)

That's all I have at the moment... More to follow soon.

Regards

David Rowley

partial-sort-basic-1_rebased.patch
Description: Binary data


```
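The fallback David mentions can be spelled out. DEFAULT_NUM_DISTINCT is 200 in PostgreSQL; the 10M-row table size below is an assumption chosen only to reproduce the 50k figure from the mail:

```python
DEFAULT_NUM_DISTINCT = 200

def fallback_num_groups(rel_tuples):
    # With no usable statistics for an expression like x+0, the planner
    # falls back to guessing rel_tuples / DEFAULT_NUM_DISTINCT groups
    return rel_tuples / DEFAULT_NUM_DISTINCT

print(fallback_num_groups(10_000_000))  # 50000.0
```

That over-estimate is what steers the planner toward the partial-sort index-scan plan here, even though only one group actually exists.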

### Re: [HACKERS] Reporting the commit LSN at commit time

```On Tue, Aug 19, 2014 at 1:21 AM, Craig Ringer cr...@2ndquadrant.com wrote:
There's plenty of agreement on not a GUC - but what about alternatives?

It could be a new protocol message. Currently there are no transaction
oriented protocol messages (other than the transaction status in
ReadyForQuery). But would it not make sense to have TransactionBegin,
TransactionCommit, and TransactionAbort in the protocol? Would that
make it easier for the client-side failover to keep track of what
transactions are pending or committed and need to be verified after a
failover?

--
greg


```

### Re: [HACKERS] Reporting the commit LSN at commit time

```On 2014-08-19 08:21:10 +0800, Craig Ringer wrote:
On 08/19/2014 01:03 AM, Robert Haas wrote:
2. I agree that it's not good to have this get controlled by a GUC.
If the behavior change is big enough that it's going to break clients,
adding a GUC isn't a sufficient remedy.  If it's not, adding a GUC is
unnecessary.

There's plenty of agreement on not a GUC - but what about alternatives?

What's the problem with the COMMIT WITH (report_lsn on) I've proposed?
Reporting the LSN in the command tag? Anything doing transparent
failover needs to be aware of transaction boundaries anyway.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```
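Whatever the transport ends up being (command tag, a COMMIT option, or a new protocol message), the client-side use of a reported commit LSN might look like this (hypothetical helper, not a libpq API):

```python
def lsn_to_int(lsn):
    # PostgreSQL LSN 'X/Y' -> 64-bit byte position
    hi, lo = lsn.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

def txn_survived_failover(commit_lsn, replayed_lsn):
    # After promoting a standby, a transaction is known-safe only if
    # the new primary has replayed past the reported commit LSN
    return lsn_to_int(replayed_lsn) >= lsn_to_int(commit_lsn)

print(txn_survived_failover('0/3000170', '0/3000250'))  # True
print(txn_survived_failover('0/3000250', '0/3000170'))  # False
```

This is the comparison a transparent-failover client would make per transaction, which is why it needs both the commit LSN and awareness of transaction boundaries.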

### Re: [HACKERS] After switching primary server while using replication slot.

```On Mon, Aug 18, 2014 at 11:16 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
Hi all,

After switching the primary server while using replication slots, a
standby server can become unable to connect to the new primary server.
Imagine this situation: the primary server has two ASYNC standby
servers, each using its own replication slot.
One standby (A) applies WAL without problems, but the other
standby (B) has stopped after connecting to the primary server
(or sending WAL to it is badly delayed).

In this situation, the standby (B) has not received WAL segment files
while it is stopped, and the primary server cannot remove WAL segments
which have not been received by all standbys. Therefore the primary
server has to keep the WAL segment files which have not yet been
received by every standby.
But standby (A) can do checkpoints itself, and can then
recycle WAL segments, so the number of WAL segments on each server
differs (standby (A) keeps fewer WAL files than the primary server).
After the primary server crashes and standby (A) is promoted to primary,
we can try to connect standby (B) to standby (A) as a new standby server.
But this will fail, because standby (A) might not have the WAL
segment files that standby (B) requires.

This sounds valid concern.

To resolve this situation, I think that we should make master server
to notify about removal of WAL segment to all standby servers.
And the standby servers recycle WAL segments files base on that information.

Thought?

How does the server recycle WAL files after it's promoted from
standby to master? Does it do that as it likes? If so, your approach
would not be enough.

The approach prevents unexpected removal of WAL files while the standby
is running. But after the standby is promoted to master, it might recycle
needed WAL files immediately. So another standby may still fail to retrieve
the required WAL file after the promotion.

ISTM that, in order to address this, we might need to log all the replication
slot activities and replicate them to the standby. I'm not sure if this
breaks the design of replication slot at all, though.

Regards,

--
Fujii Masao


```
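The retention rule at the heart of the problem, sketched (illustrative; real replication slots track restart_lsn and more state than a single integer):

```python
def recycle_horizon(slot_restart_lsns):
    # The server may only recycle WAL older than the minimum restart
    # position across its replication slots
    return min(slot_restart_lsns)

# Old primary: a slot for lagging standby B pins old segments
print(hex(recycle_horizon([0x2000000, 0x9000000])))  # 0x2000000

# Promoted standby A never had a slot for B, so its horizon is
# further ahead and B's required segments may already be gone
print(hex(recycle_horizon([0x9000000])))  # 0x9000000
```

This is why Fujii suggests the slot activity itself might need to be replicated: the promoted server's horizon is computed without any knowledge of the other standby's slot.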

### Re: [HACKERS] 9.5: Memory-bounded HashAgg

```On 19 Srpen 2014, 9:52, Jeff Davis wrote:
On Fri, 2014-08-15 at 13:53 -0400, Robert Haas wrote:
I think that's right, and I rather like your (Jeff's) approach.  It's
definitely true that we could do better if we have a mechanism for
serializing and deserializing group states, but (1) I think an awful
lot of cases would get an awful lot better even just with the approach
proposed here and (2) I doubt we would make the
serialization/deserialization interfaces mandatory, so even if we had
that we'd probably want a fallback strategy anyway.

Thank you for taking a look.

To solve the problem for array_agg, that would open up two potentially
lengthy discussions:

1. Trying to support non-serialized representations (like
ArrayBuildState for array_agg) as a real type rather than using
internal.

That's certainly an option, and it's quite straightforward. The downside
of it is that you either prevent the aggregates from using the most
efficient state form (e.g. the array_agg might use a simple array as a
state) or you cause a proliferation of types with no other purpose.

2. What changes should we make to the aggregate API? As long as we're
changing/extending it, should we go the whole way and support partial
aggregation[1] (particularly useful for parallelism)?

Maybe, but not in this patch please. That's far wider scope, and while
considering it when designing API changes is probably a good idea, we
should resist the attempt to do those two things in the same patch.

Both of those discussions are worth having, and perhaps they can happen
in parallel as I wrap up this patch.

Exactly.

I'll see whether I can get consensus that my approach is (potentially)
commit-worthy, and your statement that it (potentially) solves a real
problem is a big help.

IMHO it's a step in the right direction. It may not go as far as I'd like,
but that's OK.

regards
Tomas


```

### Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

```On 19 Srpen 2014, 10:26, Jeff Davis wrote:
On Sat, 2014-08-16 at 23:09 +0200, Tomas Vondra wrote:
But maybe the inheritance really is not necessary - maybe it would be
enough to track this per-context, and then just walk through the
contexts and collect this. Because my observation is that most of the
time is actually spent in walking through blocks and freelists.

That makes a lot of sense to me.

Another approach is to pass a flag to hash_create that tells it not to
create a subcontext. Then we don't need to traverse anything; we already
know which context we want to look at. Perhaps I was being too clever
with the idea of tracking space for an entire hierarchy.

Also, as I pointed out in my reply to Robert, adding too many fields to
MemoryContextData may be the cause of the regression. Your idea requires
only one field, which doesn't show the same regression in my tests.

Yeah, keeping the structure size below 64B seems like a good idea.

The use-case for this is tracking a chosen subtree of contexts - e.g.
aggcontext and below, so I'd expect the tracked subtrees to be relatively
shallow. Am I right?

My fear is that by removing the inheritance bit, we'll hurt cases with a
lot of child contexts. For example, array_agg currently creates a separate
context for each group - what happens if you have 100k groups and do
MemoryContextGetAllocated? I guess iterating over 100k groups is not free.

Wouldn't the solution with inheritance and propagating the accounting info
to the parent actually better? Or maybe allowing both, having two flags
when creating a context instead of one?

AllocSetCreateTracked( , bool track, bool propagate_immediately)

By squashing both flags into a single mask you wouldn't increase the size.
Also, do we really need to track allocated bytes - couldn't we track
kilobytes or something and use smaller data types to get below the 64B?

regards
Tomas


```
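Squashing the two proposed flags into one bitmask, as Tomas suggests, keeps the struct from growing by two booleans (the flag names here are illustrative):

```python
TRACK_SELF = 0x1               # hypothetical flag: account in this context
PROPAGATE_IMMEDIATELY = 0x2    # hypothetical flag: push totals to parent

def make_flags(track, propagate_immediately):
    flags = 0
    if track:
        flags |= TRACK_SELF
    if propagate_immediately:
        flags |= PROPAGATE_IMMEDIATELY
    return flags

f = make_flags(True, False)
print(bool(f & TRACK_SELF), bool(f & PROPAGATE_IMMEDIATELY))  # True False
```

One masked word covers both behaviors, which matters if MemoryContextData really does need to stay under the 64-byte threshold discussed above.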

### Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

```On Wed, Aug 13, 2014 at 4:01 PM, Dilip kumar dilip.ku...@huawei.com wrote:
On 11 August 2014 10:29, Amit kapila wrote,
5.

res = executeQuery(conn,

select relname, nspname from pg_class c, pg_namespace ns

where (relkind = \'r\' or relkind = \'m\')

and c.relnamespace = ns.oid order by relpages desc,

progname, echo);

a. Here you need to use SQL keywords in capital letters; refer to one
of the other callers of executeQuery() in vacuumdb.c

b. Why do you need this condition c.relnamespace = ns.oid in above

query?

IT IS POSSIBLE THAT, TWO NAMESPACE HAVE THE SAME TABLE NAME, SO WHEN WE
ARE SENDING COMMAND FROM CLIENT WE NEED TO GIVE NAMESPACE WISE BECAUSE WE
NEED TO VACUUM ALL THE

TABLES.. (OTHERWISE TWO TABLE WITH SAME NAME FROM DIFFERENT NAMESPACE
WILL BE TREATED SAME.)

That's right; however, writing the query in the way below might
make it more understandable:
+ SELECT relname, nspname FROM pg_class c, pg_namespace ns

SELECT c.relname, ns.nspname FROM pg_class c, pg_namespace ns

7.

Here we are getting message string, I think if we need to find the error
code then we need to parse the string, and after that we need to compare
with error codes.

Is there any other way to do this ?

You can compare against SQLSTATE by using below API.
val = PQresultErrorField(res, PG_DIAG_SQLSTATE);

You need to handle *42P01* SQLSTATE, also please refer below
usage where we are checking SQLSTATE.

fe-connect.c
PQresultErrorField(conn->result, PG_DIAG_SQLSTATE),

1.
* If user has not given the vacuum of complete db, then if

I think here you have said reverse of what code is doing.
You don't need *not* in above sentence.

2.
+ appendPQExpBuffer(sql, \%s\.\%s\, nspace, relName);
I think here you need to use function fmtQualifiedId() or fmtId()
or something similar to handle quotes appropriately.

3.

+  */
+ if (!r && !completedb)
Here the usage of completedb variable is reversed which means
that it goes into error path when actually whole database is
getting vacuumed and the reason is that you are setting it
to false in below code:
+ /* Vaccuming full database*/
+ vacuum_tables = false;

4.
Functions prepare_command() and vacuum_one_database() contain
duplicate code, is there any problem in using prepare_command()
function in vacuum_one_database(). Another point in this context
is that I think it is better to name function prepare_command()
as append_vacuum_options() or something on that lines, also it will
be better if you can write function header for this function as well.

5.
+ if (error)
+ {
+ for (i = 0; i < max_slot; i++)
+ {
+ DisconnectDatabase(connSlot[i]);
+ }

Here why do we need DisconnectDatabase() type of function?
Why can't we simply call PQfinish() as in base code?

6.
+ /*
+  * if table list is not provided then we need to do vaccum for whole DB
+  * get the list of all tables and prpare the list
+  */
The spelling of prepare is wrong; I noticed this spelling mistake.

7. I think in new mechanism cancel handler will not work.
In single connection vacuum it was always set/reset
in function executeMaintenanceCommand(). You might need
to set/reset it in function run_parallel_vacuum().

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

```
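
To illustrate the SQLSTATE comparison suggested above (hypothetical helper, not code from the vacuumdb patch; with libpq the code string would come from `PQresultErrorField(res, PG_DIAG_SQLSTATE)`):

```c
#include <string.h>

/* "42P01" is the documented SQLSTATE for undefined_table. */
#define ERRCODE_UNDEFINED_TABLE "42P01"

/* Return 1 if the given SQLSTATE (as returned by
 * PQresultErrorField(res, PG_DIAG_SQLSTATE)) means "relation does not
 * exist", 0 otherwise.  PQresultErrorField may return NULL, so guard it. */
static int
is_undefined_table(const char *sqlstate)
{
    return sqlstate != NULL &&
           strcmp(sqlstate, ERRCODE_UNDEFINED_TABLE) == 0;
}
```

Comparing the five-character SQLSTATE avoids parsing the localized, human-readable error message text.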

### Re: [HACKERS] Reporting the commit LSN at commit time

```On Tue, Aug 19, 2014 at 7:21 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-19 08:21:10 +0800, Craig Ringer wrote:
On 08/19/2014 01:03 AM, Robert Haas wrote:
2. I agree that it's not good to have this get controlled by a GUC.
If the behavior change is big enough that it's going to break clients,
adding a GUC isn't a sufficient remedy.  If it's not, adding a GUC is
unnecessary.

There's plenty of agreement on not a GUC - but what about alternatives?

What's the problem with the COMMIT WITH (report_lsn on) I've proposed?
Reporting the LSN in the command tag? Anything doing transparent
failover needs to be aware of transaction boundaries anyway.

So something like transparent failover doesn't work when a client is
working in auto commit mode? That sounds not good.

Just an idea. What about using a NoticeResponse message to report the LSN?
It can be sent basically anytime and this idea doesn't break current
wire protocol.

Regards,

--
Fujii Masao


```

### Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

```On Fri, Aug 15, 2014 at 12:55 AM, Robert Haas robertmh...@gmail.com wrote:

On Mon, Aug 11, 2014 at 12:59 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
1.
+Number of parallel connections to perform the operation. This
option will enable the vacuum
+operation to run on parallel connections, at a time one table
will
be operated on one connection.

a. How about describing w.r.t asynchronous connections

I don't think asynchronous is a good choice of word.

Agreed.

Maybe simultaneous?

Not sure. How about *concurrent* or *multiple*?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

```

### [HACKERS] Trove with PostgreSQL-XC

```Hi All,

Please let me know whether it is possible to use OpenStack Trove with Postgres-XC,
with instances and bare metal (after the Juno release).
I know it is possible to use other medium like MySQL or PostgreSQL, but I
am not sure about XC.
--
ViVek Raghuwanshi
Mobile -+91-09595950504
Skype - vivek_raghuwanshi
IRC vivekraghuwanshi
http://vivekraghuwanshi.wordpress.com/

```

### Re: [HACKERS] Reporting the commit LSN at commit time

```On 2014-08-19 19:59:51 +0900, Fujii Masao wrote:
On Tue, Aug 19, 2014 at 7:21 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-19 08:21:10 +0800, Craig Ringer wrote:
On 08/19/2014 01:03 AM, Robert Haas wrote:
2. I agree that it's not good to have this get controlled by a GUC.
If the behavior change is big enough that it's going to break clients,
adding a GUC isn't a sufficient remedy.  If it's not, adding a GUC is
unnecessary.

There's plenty of agreement on not a GUC - but what about alternatives?

What's the problem with the COMMIT WITH (report_lsn on) I've proposed?
Reporting the LSN in the command tag? Anything doing transparent
failover needs to be aware of transaction boundaries anyway.

So something like transparent failover doesn't work when a client is
working in auto commit mode? That sounds not good.

I don't think transparent failover + autocommit is a sensible
combination.

Just an idea. What about using a NoticeResponse message to report the LSN?
It can be sent basically anytime and this idea doesn't break current
wire protocol.

I think that'd be horrible from multiple perspectives: a) how to discern
them from regular notice messages b) It's not sent in the same protocol
level message as the COMMIT message. Thus there's scenarios where you
only have the commit, but not the LSN.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] [Postgres-xc-developers] Trove with PostgreSQL-XC

```XC’s libpq is binary compatible with PG.   So as long as Openstack uses libpq
to connect to PG, XC should work too.

Appreciate if you have a chance to try.

Thanks.
---
Koichi Suzuki

On 2014/08/19 20:14, Vivek Singh Raghuwanshi
vivekraghuwan...@gmail.com wrote:

Hi All,

Please let me know whether it is possible to use OpenStack Trove with Postgres-XC,
with instances and bare metal (after the Juno release).
I know it is possible to use other medium like MySQL or PostgreSQL, but I am

--
ViVek Raghuwanshi
Mobile -+91-09595950504
Skype - vivek_raghuwanshi
IRC vivekraghuwanshi
http://vivekraghuwanshi.wordpress.com/
--
___
Postgres-xc-developers mailing list
postgres-xc-develop...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers

```

### Re: [HACKERS] Enable WAL archiving even in standby

```On Fri, Aug 15, 2014 at 4:30 AM, Robert Haas robertmh...@gmail.com wrote:
On Wed, Aug 13, 2014 at 6:42 AM, Fujii Masao masao.fu...@gmail.com wrote:
I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as a valid value of archive_mode.
If it's set to always, the archiver is started when the server is in
standby
mode and all the WAL files that walreceiver wrote to the disk are archived by
using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the meanings
of the setting values on and off of archive_mode.

I like the feature, but I don't much like this as a control mechanism.
Having archive_command and standby_archive_command, as you propose
further down, seems saner.

Okay, that's fine. One question is; Which WAL files should be archived by
standby_archive_command? There are following kinds of WAL files.

(1) WAL files which were fully written and closed by walreceiver
Curently they are not archived at all.

(2) WAL file which is being written by walreceiver
This file will be closed before it's fully written because of,
for example, standby promotion.
Currently this is archived by archive_command.

(3) WAL file with new timeline, which is copied from (2)
At the end of recovery, after new timeline is assigned,
this latest WAL file with new timeline is created by being copied
from (2) (i.e., latest WAL file with old timeline). WAL data of
end-of-recovery checkpoint is written to this latest WAL file.
Currently this is archived by archive_command.

(4) Timeline history files
When standby is promoted to the master, the timeline is incremented
and the timeline history file is created.
Currently the timeline history files are archived by archive_command.

(5) WAL files generated in normal processing mode
Currently they are archived by archive_command.

I'm thinking to use standby_archive_command only for (1) because
the others are currently archived by archive_command. That means
that even if there are type (1) WAL files which have not been archived
yet after the standby promotion (i.e., the situation where WAL archiving
was delayed for some reasons in the standby), they are archived by
standby_archive_command. IOW, the archiver uses both archive commands
as the situation demands.

OTOH, maybe there are people who want to use standby_archive_command
for all the WAL files with old timeline, i.e., (1) and (2). Thought?

Regards,

--
Fujii Masao


```
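
For reference, the settings under discussion might look like this in postgresql.conf on the standby (hypothetical: `archive_mode = always` comes from the WIP patch, and `standby_archive_command` is only a proposal at this point):

```
archive_mode = always                     # proposed: archive even while in standby mode
archive_command = 'cp %p /archive/%f'     # WAL from normal processing / promotion
# proposed separate command for WAL files fully written by walreceiver (case (1)):
standby_archive_command = 'cp %p /archive/%f'
```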

### [HACKERS] dynahash vs. memory-bounded HashAggregate (hashjoin-style)

```Hi all,

while working on a prototype of memory-bounded hash aggregate (alternative
to Jeff's patch discussed in [1]), I ran into difficulties when dealing
with dynahash. So I'm asking for help ...

Some of the difficulties stem from my limited knowledge of dynahash, and
complexity in execGrouping.c, but it seems to me some are actually due to
a mismatch with the proposed hashjoin-like approach to batching.

The hashjoin-like batching essentially does this:

1) put data (in this case aggregate states) into a hash table, until a
memory limit is reached
2) double the number of batches and move half the entries from the hash
table (based on batch number, computed from the hash value)
3) repeat

This however requires two things, that I think seem quite difficult to do
with dynahash:

(a) detecting that a memory limit was reached

This is usually done with a condition like this:

if (consumed_memory > work_mem) {
... do the batching / remove ~50% of data from hash table
... it's expected consumed_memory drops by 50%
}

Where consumed memory is the size of a memory context (cheap to
compute, thanks to another patch from Jeff [2]).

This however does not work with dynahash, because dynahash does not
release memory for removed tuples - it just moves it to a freelist, so
consumed_memory only grows.

For a while I thought I could do this:

if (consumed_memory > consumed_memory_prev) {
...
consumed_memory_prev = consumed_memory
}

But then I found out dynahash does not grow continuously, but in (quite
large) steps. Exceeding the limit a bit is not a big deal, but the
growth is quite fast and quickly leads to allocating much more than the
limit.

(b) removing tuples while batching

Whenever the number of batches is increased (doubled), I need to walk
through the hash table and remove entries not belonging to the current
batch (should be ~50% of them). The only way to do this with dynahash
seems to be iterating over the entries, and then doing another search
with HASH_REMOVE. Is there a better way?

I've been considering switching this to a custom hash table (similar to
what's used in the hashjoin patch,
https://commitfest.postgresql.org/action/patch_view?id=1494), but I'm
not a big fan of replacing large amounts of code for no good reason.

Opinions?

I'd be grateful if someone more knowledgeable of dynahash / the way it's
used in execGrouping could review the prototype I have so far. There's
only a handful of functions related to dynahash, and most of the issues I
have is about passing the values properly (slots, dummy values, tuples).

regards
Tomas

[1]
http://www.postgresql.org/message-id/1407706010.6623.16.camel@jeff-desktop

[2]
http://www.postgresql.org/message-id/1407012053.15301.53.camel@jeff-desktop


```
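
The three batching steps above can be sketched as follows (a toy simulation with invented names, not dynahash or the actual patch): entries belong to batch `hash % nbatch`, and doubling `nbatch` evicts roughly half of the resident entries, which corresponds to the walk-and-HASH_REMOVE pass described in (b).

```c
#include <stdint.h>

/* One in-memory aggregate-state entry in this toy model. */
typedef struct Entry
{
    uint32_t hash;
    int      resident;  /* still in the in-memory hash table? */
} Entry;

/* Double nbatch and evict entries that no longer belong to the current
 * batch (in the real scheme they would be spilled to their batch file).
 * Returns the number of evicted entries - expected to be ~50%. */
static int
increase_nbatch(Entry *entries, int nentries, int *nbatch, int curbatch)
{
    int evicted = 0;

    *nbatch *= 2;
    for (int i = 0; i < nentries; i++)
    {
        if (entries[i].resident &&
            (int) (entries[i].hash % (uint32_t) *nbatch) != curbatch)
        {
            entries[i].resident = 0;  /* spill to its batch file */
            evicted++;
        }
    }
    return evicted;
}
```

With uniformly distributed hashes this halves the resident set each time, which is the "consumed_memory drops by 50%" expectation in the pseudocode above.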

### Re: [HACKERS] [Postgres-xc-developers] Trove with PostgreSQL-XC

```Thanks,
One more question: does this library support multitenancy, or do we need to
launch a separate VPC (virtual private cloud) every time for each customer?
It would be good if we have both options.

On Tue, Aug 19, 2014 at 5:04 PM, Koichi Suzuki koi...@intellilink.co.jp wrote:

XC’s libpq is binary compatible with PG.   So as long as Openstack uses
libpq to connect to PG, XC should work too.

Appreciate if you have a chance to try.

Thanks.
---
Koichi Suzuki

On 2014/08/19 20:14, Vivek Singh Raghuwanshi vivekraghuwan...@gmail.com
wrote:

Hi All,

Please let me know whether it is possible to use OpenStack Trove with
Postgres-XC,
with instances and bare metal (after the Juno release).
I know it is possible to use other medium like MySQL or PostgreSQL, but
I am not sure about XC.

--
ViVek Raghuwanshi
Mobile -+91-09595950504
Skype - vivek_raghuwanshi
IRC vivekraghuwanshi
http://vivekraghuwanshi.wordpress.com/

--
___
Postgres-xc-developers mailing list
postgres-xc-develop...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers

--
ViVek Raghuwanshi
Mobile -+91-09595950504
Skype - vivek_raghuwanshi
IRC vivekraghuwanshi
http://vivekraghuwanshi.wordpress.com/

```

### Re: [HACKERS] Reporting the commit LSN at commit time

```On Tue, Aug 19, 2014 at 8:22 PM, Andres Freund and...@2ndquadrant.com wrote:
On 2014-08-19 19:59:51 +0900, Fujii Masao wrote:
On Tue, Aug 19, 2014 at 7:21 PM, Andres Freund and...@2ndquadrant.com
wrote:
On 2014-08-19 08:21:10 +0800, Craig Ringer wrote:
On 08/19/2014 01:03 AM, Robert Haas wrote:
2. I agree that it's not good to have this get controlled by a GUC.
If the behavior change is big enough that it's going to break clients,
adding a GUC isn't a sufficient remedy.  If it's not, adding a GUC is
unnecessary.

There's plenty of agreement on not a GUC - but what about alternatives?

What's the problem with the COMMIT WITH (report_lsn on) I've proposed?
Reporting the LSN in the command tag? Anything doing transparent
failover needs to be aware of transaction boundaries anyway.

So something like transparent failover doesn't work when a client is
working in auto commit mode? That sounds not good.

I don't think transparent failover + autocommit is a sensible
combination.

Just an idea. What about using a NoticeResponse message to report the LSN?
It can be sent basically anytime and this idea doesn't break current
wire protocol.

I think that'd be horrible from multiple perspectives: a) how to discern
them from regular notice messages

You can implement your own protocol upon existing messages like
replication is done.

b) It's not sent in the same protocol
level message as the COMMIT message. Thus there's scenarios where you
only have the commit, but not the LSN.

Hmm.. you can change the code so that the message with LSN is sent
as soon as COMMIT message is sent, if required.

Regards,

--
Fujii Masao


```

```On Tue, Aug 19, 2014 at 9:52 AM,  furu...@pm.nttdata.co.jp wrote:
Thank you for updating the patch.

I did not get error with applying, and compiling.
It works fine. I think this function code has no problem.
Could you please submit patch to commit fest app?

Thanks for the review!

As you pointed out, submitted patch to commit fest app.

When replication slot is not specified in pg_receivexlog, the flush location
in the feedback message always indicates invalid. So there seems to be
no need to send the feedback as soon as fsync is issued, in that case.
How should this option work when replication slot is not specified?

Regards,

--
Fujii Masao


```

### Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

```
From: Fujii Masao masao.fu...@gmail.com
What's the main purpose of this tool? If it's for WAL archiving, the tool
name

pg_copy sounds too generic. We already have pg_archivecleanup, so maybe
pg_archivecopy or something is better for the consistency?

pg_copy in the patch copies the file to the destination in a
straightforward way,
i.e., directly copies the file to the dest file with actual name. This can
cause
the problem which some people reported. The problem is that, when the
server
crashes while WAL file is being archived by cp command, its
partially-filled
WAL file remains at the archival area. This half-baked archive file can
cause
various troubles. To address this, WAL file needs to be copied to the
temporary
file at first, then renamed to the actual name. I think that pg_copy
should

copy the WAL file in that way.

I intended to make pg_copy a straightforward replacement of cp/copy, which
complements the missing sync.  The direct I/O and posix_fadvise() features may be
convenient but not essential for this utility.  cp/copy doesn't copy to a
temporary file, and the problem can be solved easily by mv/move.  I wanted
to keep pg_copy as generic as cp/copy, so that it can be used by some
advanced features in the future, e.g. comprehensive backup/recovery
management like RMAN (this example may not be best) when it's integrated
into the core.

With that said, copying to a temporary file like dest.tmp and renaming it
to dest sounds worthwhile even as a basic copy utility.  I want to avoid
copying to a temporary file with a fixed name like _copy.tmp, because some
advanced utility may want to run multiple instances of pg_copy to copy
several files into the same directory simultaneously.  However, I'm afraid
multiple dest.tmp files might continue to occupy disk space after
canceling copy or power failure in some use cases, where the copy of the
same file won't be retried.  That's also the reason why I chose to not use a
temporary file like cp/copy.

Currently pg_copy always syncs the archive file, and there is no way to
disable
that. But I'm sure that not everyone wants to sync the archive file. So I
think

that it's better to add the option specifying whether to sync the file
or not, into
pg_copy.

pg_copy is for copying a file reliably by syncing.  If sync is not
necessary, people can use cp/copy.

Some users might want to specify whether to call posix_fadvise or not
because

they might need to re-read the archived files just after the archiving.
For example, network copy of the archived files from the archive area to
remote site for disaster recovery.

This sounds reasonable.  Do you have an idea on the switch name and the
default behavior?

Do you recommend using pg_copy for restore_command? If yes, it also should
be documented. And in the WAL restore case, the restored WAL files are read
soon by recovery, so posix_fadvise is not good in that case.

Direct I/O and posix_fadvise are used only for destination file. But why
not

source file? That might be useful especially for restore_command case.

No, I don't think it's necessary to use pg_copy for restore_command.

At last, the big question is, is there really no OS command which provides
the same functionality as pg_copy does? If there is, I'd like to avoid
duplicate

work basically.

If there exists such a command available in the standard OS installation, I
want to use it.

Regards
MauMau


```
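
The copy-to-a-temporary-name-then-rename scheme discussed above could look roughly like this (a POSIX sketch with invented names, not the pg_copy patch): writing `dst.tmp`, fsyncing, then renaming means a crash can never leave a partially filled file under the final name, and a per-destination temp name keeps concurrent copies into one directory from colliding.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Copy src to dst durably: write dst.tmp, fsync it, then rename() it
 * into place.  rename() is atomic within one filesystem, so readers of
 * the archive never see a half-written file under the final name. */
static int
copy_durably(const char *src, const char *dst)
{
    char    tmp[1024];
    char    buf[8192];
    ssize_t n;
    int     in, out;

    snprintf(tmp, sizeof(tmp), "%s.tmp", dst);

    if ((in = open(src, O_RDONLY)) < 0)
        return -1;
    if ((out = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
    {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof(buf))) > 0)
        if (write(out, buf, (size_t) n) != n)
            goto fail;
    if (n < 0 || fsync(out) != 0)   /* make the data durable first */
        goto fail;
    close(in);
    close(out);
    return rename(tmp, dst);        /* atomic within one filesystem */

fail:
    close(in);
    close(out);
    unlink(tmp);
    return -1;
}
```

A production version would also fsync the containing directory after the rename, so the new directory entry itself survives a crash.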

### Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

```

Is this supposed to be session-local data, or is it visible from remote
sessions too?  How durable is it supposed to be?  Keep in mind that in
case of a crash, all pgstats data is erased.

I want it to be visible from other sessions.  I'm okay about the data
erasure during recovery.  We can probably extend pg_statsinfo to save the
new info for long-term trend analysis.  TBH, I want a feature like
pg_statsinfo in core.

Regards
MauMau


```

### Re: [HACKERS] Reporting the commit LSN at commit time

```On 08/19/2014 06:21 PM, Andres Freund wrote:
What's the problem with the COMMIT WITH (report_lsn on) I've proposed?
Reporting the LSN in the command tag? Anything doing transparent
failover needs to be aware of transaction boundaries anyway.

Tom's objection to a GUC applies there too - a client app can send that
when the underlying driver doesn't expect to get the results.

I'm not completely convinced that's a problem - oh dear, the app breaks.
The answer to so many other things in Pg is "well, don't do that then"
that I don't see this as overly different.

However, granting that it is a problem, the same objection to a GUC
applies to this too.
--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] Reporting the commit LSN at commit time

```On 2014-08-19 21:47:26 +0800, Craig Ringer wrote:
On 08/19/2014 06:21 PM, Andres Freund wrote:
What's the problem with the COMMIT WITH (report_lsn on) I've proposed?
Reporting the LSN in the command tag? Anything doing transparent
failover needs to be aware of transaction boundaries anyway.

Tom's objection to a GUC applies there too - a client app can send that
when the underlying driver doesn't expect to get the results.

I don't really think this is true. With a GUC it's set for the whole
session or even users. With such a option to COMMIT it'd only set when
issued by something that actually does transparent failover (i.e. the
underlying driver).

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### [HACKERS] New PostgreSQL buildfarm client release 4.14 - bug fix for MSVC

```There is a new release - version 4.14 - of the buildfarm client, now
available at

The only change of note is a fix for a bug, present in releases 4.12 and
4.13, that affects only MSVC clients (such that the client will not
complete a run). Clients on other platforms do not need to upgrade.

cheers

andrew


```

### Re: [HACKERS] Enable WAL archiving even in standby

```
From: Fujii Masao masao.fu...@gmail.com

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as a valid value of
archive_mode.
If it's set to always, the archiver is started when the server is in
standby
mode and all the WAL files that walreceiver wrote to the disk are archived
by

using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the
meanings

of the setting values on and off of archive_mode.

I think that this feature is useful for the case, e.g., where large
database

needs to be replicated between remote servers. Imagine the situation where
the replicated database gets corrupted completely in the remote standby.
How should we address this problematic situation and restart the standby?

One approach is to take a fresh backup from the master and restore it onto
the standby. But since the database is large and there is long distance
between two servers, this approach might take a surprisingly long time.

Another approach is to restore the backup which was taken from the standby
before. But most of many WAL files which the backup needs might exist only
in the master (because WAL archiving cannot be enabled in the standby) and
they need to be transferred from the master to the standby via a
long-distance network. So I think that this approach also would take a
fairly long time.
To shorten that time, you may think that archive_command in the master can
be set so that it transfers WAL files from the master to the standby's
archival storage. I agree that this setting can accelerate the database
restore

process. But this causes every WAL file to be transferred between the remote
servers twice (once by streaming replication, once by archive_command),
which is a waste of network bandwidth.

Great.  This is exactly what I hoped for disaster recovery, although I
haven't looked at the patch yet.

Back to the patch. If archive_mode is set to always, archive_command is
always used to archive WAL files even during recovery. Do we need to
separate

the command into two for master and standby, respectively? We can add
something like standby_archive_command parameter which is used to archive
only WAL files walreceiver writes. The other WAL files are archived by
archive_command. I'm not sure if it's really worth separating the command
that way. Is there any use case?

I don't see any reason to separate parameters.  I want the spec simple.

I've not included the update of document in the patch yet. If we agree to
support this feature, I will do the remaining work.

Could you consider adding a new section for disaster recovery that describes
concrete parameter settings (e.g. how do we discard old archive WAL files
after taking a base backup from standby, because backup label file is not
created?).  Good luck!

Regards
MauMau


```

### Re: [HACKERS] WAL format and API changes (9.5)

```Heikki Linnakangas wrote:

Barring objections or better ideas, I'm leaning towards

WFM

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] WAL format and API changes (9.5)

```On 2014-08-19 10:33:29 -0400, Alvaro Herrera wrote:
Heikki Linnakangas wrote:

Barring objections or better ideas, I'm leaning towards

WFM

for me too. Although we could imo strip the 'XLog' in the beginning if
we want to make it shorter. The ForRedo is saying that pretty much.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
server_cert_valid: Did the server present a valid certificate?
yes or no

server_cert_matches_host: Does the Common Name of the certificate
match the host connected to? yes or no

Aren't these questions addressed by sslmode?

Exposing the SSL information as generic key/value pairs allows
adding more attributes in the future, without breaking the ABI, and
it also allows exposing implementation-specific information in a
generic way. The attributes listed above cover the needs of psql.
What else do we need?

At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

For all certificates:
(client, server, cert that signed each, any intermediate CAs, root CAs)
Certificate itself (perhaps in DER, PEM, X509 formats..)
Fingerprint
Signed-By info
Common Name
Organization (et al)
Alternate names
Issue date, expiration date
CRL info, OCSP info
Allowed usage (encryption, signing, etc)

CRL checking done?
OCSP used?

I think it would also be nice to get more information from the
server's certificate, like the hostname and the organization its
issued to, and expiration date, so that an interactive client like
pgAdmin or even psql could display that information like a web
browser does. Would it be best to add those as extra attributes in
the above list, perhaps with a server_cert_* prefix, or add a new
function for extracting server cert's attributes?

This really shouldn't be for *just* the server's certificate but rather
available for all certificates involved- on both sides.

The other question is: What do we do with PQgetssl()? We should
document it as deprecated, but we'll have to keep it around for the
foreseeable future for backwards-compatibility. We obviously cannot
return a valid OpenSSL struct when using any other implementation,
so I think it'll have to just return NULL when not using OpenSSL.
Probably the most common use of PQgetssl() is to just check if it
returns NULL or not, to determine if SSL is enabled, so a client
that does that would incorrectly think that SSL is not used, even
when it is. I think we can live with that.

That's not ideal, but the only other option I can think of offhand is to
break the existing API and force everyone to update and that seems
worse.

Have you looked at how this change will play out with the ODBC driver..?
Especially on Windows with the SSL library you're proposing we use
there..  I recall that at one point the ODBC driver simply used libpq to
handle the authentication and set everything up, and then switched to
talking directly without libpq.  In any case, it'd probably be good to
make sure the attributes you're suggesting are sufficient to meet the
needs of the ODBC driver too.

Thanks,

Stephen


```
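
The key/value shape Heikki proposes could be sketched like this (names invented; this is not libpq code): because callers look attributes up by name, new attributes - including implementation-specific ones - can be added later without changing any function signature or breaking the ABI, which is the argument made in the thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute table an SSL implementation might expose. */
typedef struct SSLAttr
{
    const char *name;
    const char *value;
} SSLAttr;

static const SSLAttr demo_attrs[] = {
    { "server_cert_valid",        "yes" },
    { "server_cert_matches_host", "yes" },
    { "cipher",                   "ECDHE-RSA-AES256-GCM-SHA384" },
    { NULL, NULL }
};

/* Look up an attribute by name; NULL means "unknown attribute" or
 * "not supported by this SSL implementation". */
static const char *
demo_ssl_attribute(const SSLAttr *attrs, const char *name)
{
    for (; attrs->name != NULL; attrs++)
        if (strcmp(attrs->name, name) == 0)
            return attrs->value;
    return NULL;
}
```

A "standard attributes required, extra attributes optional" split, as Andres suggests below, fits naturally here: standard names are guaranteed to be in the table, the rest may return NULL.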

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On 2014-08-19 10:48:41 -0400, Stephen Frost wrote:
Exposing the SSL information as generic key/value pairs allows
adding more attributes in the future, without breaking the ABI, and
it also allows exposing implementation-specific information in a
generic way. The attributes listed above cover the needs of psql.
What else do we need?

At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

For all certificates:
(client, server, cert that signed each, any intermediate CAs, root CAs)
Certificate itself (perhaps in DER, PEM, X509 formats..)
Fingerprint
Signed-By info
Common Name
Organization (et al)
Alternate names
Issue date, expiration date
CRL info, OCSP info
Allowed usage (encryption, signing, etc)

CRL checking done?
OCSP used?

I'm not really sure we need all that. We're not building a general ssl
library abstraction here. Presenting all those in a common and useful
format isn't trivial.

What I'm wondering is whether we should differentiate 'standard'
attributes that we require from ones that a library can supply
optionally. If we don't we'll have difficulty enlarging the 'standard'
set over time.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On Tue, Aug 19, 2014 at 4:48 PM, Stephen Frost sfr...@snowman.net wrote:
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
server_cert_valid: Did the server present a valid certificate?
yes or no

server_cert_matches_host: Does the Common Name of the certificate
match the host connected to? yes or no

Aren't these questions addressed by sslmode?

Not entirely. You can have sslmode=require and have a matching
certificate. You don't *have* to have sslmode=verify-full for that.

However, whether it makes *sense* without sslmode is another story -
but assuming you use something like kerberos for auth, it might. For

Exposing the SSL information as generic key/value pairs allows
adding more attributes in the future, without breaking the ABI, and
it also allows exposing implementation-specific information in a
generic way. The attributes listed above cover the needs of psql.
What else do we need?

At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

For all certificates:
(client, server, cert that signed each, any intermediate CAs, root CAs)
Certificate itself (perhaps in DER, PEM, X509 formats..)

Yeah, if we can extract it in PEM for example, that would be useful.

Fingerprint

Definitely.

Signed-By info

If we can get the full cert, do that one instead.

Common Name

Definitely.

Organization (et al)
Alternate names
Issue date, expiration date
CRL info, OCSP info
Allowed usage (encryption, signing, etc)

All those would also be covered by the "certificate itself" part I
think - they're not that common.

CRL checking done?
OCSP used?

I think it would also be nice to get more information from the
server's certificate, like the hostname and the organization its
issued to, and expiration date, so that an interactive client like
pgAdmin or even psql could display that information like a web
browser does. Would it be best to add those as extra attributes in
the above list, perhaps with a server_cert_* prefix, or add a new
function for extracting server cert's attributes?

This really shouldn't be for *just* the server's certificate but rather
available for all certificates involved- on both sides.

Well, if you are already the client, wouldn't you know your own certificate?

The other question is: What do we do with PQgetssl()? We should
document it as deprecated, but we'll have to keep it around for the
foreseeable future for backwards-compatibility. We obviously cannot
return a valid OpenSSL struct when using any other implementation,
so I think it'll have to just return NULL when not using OpenSSL.
Probably the most common use of PQgetssl() is to just check if it
returns NULL or not, to determine if SSL is enabled, so a client
that does that would incorrectly think that SSL is not used, even
when it is. I think we can live with that.

That's not ideal, but the only other option I can think of offhand is to
break the existing API and force everyone to update and that seems
worse.

Agreed.

If we just return an arbitrary pointer, then any application that
*did* actually try to use it would crash.

It's not ideal, but erroring in the direction of not saying we're secure
when we are is acceptable - unlike the opposite.

Of course, we need to publish it very clearly in the release notes,
and I would suggest backpatching into the documentation in old
versions etc as well.

Have you looked at how this change will play out with the ODBC driver..?
Especially on Windows with the SSL library you're proposing we use
there..  I recall that at one point the ODBC driver simply used libpq to
handle the authentication and set everything up, and then switched to
talking directly without libpq.  In any case, it'd probably be good to
make sure the attributes you're suggesting are sufficient to meet the
needs of the ODBC driver too.

+1.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Andres Freund (and...@2ndquadrant.com) wrote:
On 2014-08-19 10:48:41 -0400, Stephen Frost wrote:
At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

[...]

I'm not really sure we need all that. We're not building a general ssl
library abstraction here.

Really?  I'm pretty sure that's exactly what we're doing.  What I was
wondering is which one we should be modeling off of.

One thought I had was to look at what Apache's mod_ssl provides, which
can be seen here: http://httpd.apache.org/docs/2.2/mod/mod_ssl.html

I know that I've used quite a few of those.

Telling users they simply can't have this information isn't acceptable.
I'm not a huge fan of just passing back all of the certificates and
making the user extract out the information themselves, but if it comes
down to it then that's at least better than removing any ability to get
at that information.

What I'm wondering is whether we should differentiate 'standard'
attributes that we require from ones that a library can supply
optionally. If we don't we'll have difficulty enlarging the 'standard'
set over time.

If we end up not being able to provide everything for all of the
libraries we support then perhaps we can document which are available
from all of them, but I'd hope the list of "only in X" is pretty small.

Thanks,

Stephen


```

### Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

```Abhijit Menon-Sen wrote:
At 2014-06-29 20:35:04 +0900, maumau...@gmail.com wrote:

Thanks, I marked it as ready for committer.  I hope Fujii san or
another committer will commit this, refining English expression if
necessary.

Since it was just a matter of editing, I went through the patch and
corrected various minor errors (typos, awkwardness, etc.). I agree
that this is now ready for committer.

FWIW I think "determines" was correct.

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On Tue, Aug 19, 2014 at 5:05 PM, Stephen Frost sfr...@snowman.net wrote:
On 2014-08-19 10:48:41 -0400, Stephen Frost wrote:
At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

[...]

I'm not really sure we need all that. We're not building a general ssl
library abstraction here.

Really?  I'm pretty sure that's exactly what we're doing.  What I was
wondering is which one we should be modeling off of.

One thought I had was to look at what Apache's mod_ssl provides, which
can be seen here: http://httpd.apache.org/docs/2.2/mod/mod_ssl.html

I know that I've used quite a few of those.

Telling users they simply can't have this information isn't acceptable.
I'm not a huge fan of just passing back all of the certificates and
making the user extract out the information themselves, but if it comes
down to it then that's at least better than removing any ability to get
at that information.

Yeah, being able to provide most of them easily accessible is a good
thing. Otherwise, we just move the burden to deparse them to the
client which will then have to know which SSL library it's built
against, so every single client that wants to do something useful with
the cert would have to know about multiple implementations.

I think starting from the apache list is a very good idea.

We should then expose the same set of data at least through the
sslinfo server module.

What I'm wondering is whether we should differentiate 'standard'
attributes that we require from ones that a library can supply
optionally. If we don't we'll have difficulty enlarging the 'standard'
set over time.

If we end up not being able to provide everything for all of the
libraries we support then perhaps we can document which are available
from all of them, but I'd hope the list of "only in X" is pretty small.

+1. I bet the most common ones will be in all of them, because
frankly, it's functionality you just need to use SSL properly.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Magnus Hagander (mag...@hagander.net) wrote:
On Tue, Aug 19, 2014 at 4:48 PM, Stephen Frost sfr...@snowman.net wrote:
Aren't these questions addressed by sslmode?

Not entirely. You can have sslmode=require and have a matching
certificate. You don't *have* to have sslmode=verify-full for that.

However, whether it makes *sense* without sslmode is another story -
but assuming you use something like kerberos for auth, it might. For

Sure- I guess my point was really, if you're not verifying them by
sslmode=verify-full, do you really want to ask the question?  If you

What else do we need?

At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

For all certificates:
(client, server, cert that signed each, any intermediate CAs, root CAs)
Certificate itself (perhaps in DER, PEM, X509 formats..)

Yeah, if we can extract it in PEM for example, that would be useful.

Fingerprint

Definitely.

Signed-By info

If we can get the full cert, do that one instead.

Common Name

Definitely.

Organization (et al)
Alternate names
Issue date, expiration date
CRL info, OCSP info
Allowed usage (encryption, signing, etc)

All those would also be covered by the "certificate itself" part I
think - they're not that common.

Not sure I agree with that but what I don't really like is the
suggestion that we'll need to tell everyone who wants more detailed
information from the certificate to link in whatever their preferred SSL
library is and use that to decode the PEM cert to pull the info.  We'll
end up having applications linking in both OpenSSL and GNUTLS, for
example, which is pretty grotty, imv.

Serial is absolutely another one we need to include, as I look over at
what mod_ssl supports.  Really, I'd look at that list as our minimum to
support..

I think it would also be nice to get more information from the
server's certificate, like the hostname and the organization it's
issued to, and expiration date, so that an interactive client like
pgAdmin or even psql could display that information like a web
browser does. Would it be best to add those as extra attributes in
the above list, perhaps with a server_cert_* prefix, or add a new
function for extracting server cert's attributes?

This really shouldn't be for *just* the server's certificate but rather
available for all certificates involved- on both sides.

Well, if you are already the client, wouldn't you know your own certificate?

Uh, no?  Not without having a library of your own which can open the
certificate file (after it figures out which one we decided to use- oh
yeah, we should probably include that information too..  and then we
have to make sure we can represent things like on a smart card) and
then parse and extract the information you want from it..

That's not ideal, but the only other option I can think of offhand is to
break the existing API and force everyone to update and that seems
worse.

Agreed.

If we just return an arbitrary pointer, then any application that
*did* actually try to use it would crash.

That wasn't what I was thinking but rather something like remove
PQgetssl and replace it with PQgetopenssl or something, breaking the
API completely, forcing everyone to make changes to compile against the
new library, etc, etc.  Very ugly but also very obvious.

It's not ideal, but erroring in the direction of not saying we're secure
when we are is acceptable - unlike the opposite.

Yeah, I tend to agree, though I don't particularly like it.  The options
are just so much worse. :/

Of course, we need to publish it very clearly in the release notes,
and I would suggest backpatching into the documentation in old
versions etc as well.

Sounds like a good idea to me.

Thanks,

Stephen


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```
On 08/19/2014 05:48 PM, Stephen Frost wrote:

* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:

server_cert_valid: Did the server present a valid certificate?
yes or no

server_cert_matches_host: Does the Common Name of the certificate
match the host connected to? yes or no

Aren't these questions addressed by sslmode?

Sort of. In sslmode=verify-ca, libpq checks that the server cert was
valid (the first attribute) and rejects the connection if not. In
verify-full mode, it also checks that the hostname matches (the second
attribute). But in sslmode=require, it's possible to connect to a server
with an invalid server cert. (To be precise, in sslmode=require mode
libpq checks the server cert if a root CA cert was given, but if no root
CA cert is configured it will allow connecting anyway).

I think it would be nice to be able to query those attributes
explicitly, rather than just expect libpq to reject the connection if
something's wrong. For example, I'm thinking that an interactive client
might present an annoying pop-up window to the user if the server cert
is not valid, asking if he wants to connect anyway, and perhaps remember
the certificate and not ask again (TOFU).

We don't actually have such functionality today; you can query the
OpenSSL structs for those things, but the checks that libpq performs are
not exactly the same as those OpenSSL does. We have our own function to
check if a wildcard cert matches a hostname, for example, and libpq
knows that "host" and "hostaddr" can be different. So this would
actually be a new feature, probably best to be implemented as a separate
patch. (I grabbed the idea for those attributes from Martijn's ancient
gnutls patch.)

Exposing the SSL information as generic key/value pairs allows
adding more attributes in the future, without breaking the ABI, and
it also allows exposing implementation-specific information in a
generic way. The attributes listed above cover the needs of psql.
What else do we need?

At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

For all certificates:
(client, server, cert that signed each, any intermediate CAs, root CAs)
Certificate itself (perhaps in DER, PEM, X509 formats..)
Fingerprint
Signed-By info
Common Name
Organization (et al)
Alternate names
Issue date, expiration date
CRL info, OCSP info
Allowed usage (encryption, signing, etc)

Hmm. That seems a bit too much. Perhaps provide just the certificate
itself in DER/PEM format, and have the client parse it (using OpenSSL or
something else) if it wants more details.

CRL checking done?

I guess, although you know implicitly that it was if the sslcrl option
was given.

OCSP used?

We don't support OCSP.

I think it would also be nice to get more information from the
server's certificate, like the hostname and the organization it's
issued to, and expiration date, so that an interactive client like
pgAdmin or even psql could display that information like a web
browser does. Would it be best to add those as extra attributes in
the above list, perhaps with a server_cert_* prefix, or add a new
function for extracting server cert's attributes?

This really shouldn't be for *just* the server's certificate but rather
available for all certificates involved- on both sides.

Ok, but why? All the other stuff is readily available in the
configuration you use to connect. I guess it doesn't hurt to expose them
through this interface as well, but I can't immediately think of an
example that would use them.

Have you looked at how this change will play out with the ODBC driver..?
Especially on Windows with the SSL library you're proposing we use
there..  I recall that at one point the ODBC driver simply used libpq to
handle the authentication and set everything up, and then switched to
talking directly without libpq.  In any case, it'd probably be good to
make sure the attributes you're suggesting are sufficient to meet the
needs of the ODBC driver too.

Indeed, the ODBC driver only uses libpq for authentication, then calls
PQgetssl(), and takes over the whole show calling SSL_read() and
SSL_write() itself. Ideally, we'd modify psqlodbc to stop doing that,
but that's not an easy job. In the short-term, I think we need to export
pqsecure_read() and pqsecure_write() functions in libpq, so that the

- Heikki


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On 2014-08-19 11:05:07 -0400, Stephen Frost wrote:
On 2014-08-19 10:48:41 -0400, Stephen Frost wrote:
At first blush, I'd say a whole bunch..  Off the top of my head I can
think of:

[...]

I'm not really sure we need all that. We're not building a general ssl
library abstraction here.

Really?  I'm pretty sure that's exactly what we're doing.

No. We should build something that's suitable for postgres, not
something general. We'll fail otherwise. For anything fancy the user has
to look at the certificate themselves. We should make it easy to get at
the whole certificate chain in a consistent manner.

Telling users they simply can't have this information isn't
acceptable.

Meh. Why? Most of that isn't something a normal libpq user is going to
need.

What I'm wondering is whether we should differentiate 'standard'
attributes that we require from ones that a library can supply
optionally. If we don't we'll have difficulty enlarging the 'standard'
set over time.

If we end up not being able to provide everything for all of the
libraries we support then perhaps we can document which are available
from all of them, but I'd hope the list of "only in X" is pretty small.

I'm pretty sure that we can't build a reasonable list of the information
exposed by every library. Especially as we're likely going to need some
agreed mapping to common names.

I'd just go for plain names for standard attributes and X-$library- for
library-specific stuff.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] [Postgres-xc-developers] Trove with PostgreSQL-XC

```I replied to the XC-list (only) to ask them to discontinue cross-posting

Replying here just so you know.

On 08/19/2014 07:46 PM, Vivek Singh Raghuwanshi wrote:
Thanks,
One more question: does this library support multitenancy, or do we need to
launch a separate VPC (virtual public cloud) every time for each customer?
It would be good if we have both options.

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] Hokey wrong versions of libpq in apt.postgresql.org

```On 08/19/2014 05:02 AM, Joshua D. Drake wrote:

I am finally able to get back around to this and I am still calling:
Hokey. I just loaded up a fresh precise (I assume trusty will act the
same way) and installed postgresql. I installed it, without the PDGD
repository and everything worked perfectly. The only error I got when
using pgxnclient to install pg_repack was an error about not having
libedit-dev installed. I installed it, and it was perfect. I even tested
with create extension etc...

So... If we are supposed to ship the latest lib... how come Debian or
Ubuntu don't do that? They ship the latest lib for the version they are
shipping and because of that, everything works, as expected.

I reiterate: the current apt.postgresql.org is not doing things correctly.
It breaks things and it shouldn't.

FWIW, this is inconsistent with what yum.postgresql.org does - it takes
the POLA approach of packaging the libpq from the major release configured
in the repo. Each major has its own sub-repo.

I find it pretty hard to justify installing a 9.3 libpq alongside a 9.1
server myself.

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] pg_dump refactor patch to remove global variables

```On 08/19/2014 01:40 AM, Robert Haas wrote:
Attached is a patch that doesn't add any new functionality or
features, all it does is get rid of the global variables that
pg_dump.c is full of.
I think this is an excellent idea.

It's also one small step toward library-ifying pg_dump.

Huge +1.

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
I think it would be nice to be able to query those attributes
explicitly, rather than just expect libpq to reject the connection
if something's wrong. For example, I'm thinking that an interactive
client might present an annoying pop-up window to the user if the
server cert is not valid, asking if he wants to connect anyway, and
perhaps remember the certificate and not ask again (TOFU).

Alright, I could see that being useful, though as you say, it'd really
be new functionality.

Hmm. That seems a bit too much. Perhaps provide just the certificate
itself in DER/PEM format, and have the client parse it (using
OpenSSL or something else) if it wants more details.

I really don't care for that approach.  Our SSL support has always been
horrible- I was hoping we'd actually improve that situation.  Adding
things in piecemeal over time will just be painful for our users and I
don't see why we should wait.

OCSP used?

We don't support OCSP.

Another thing that we really should address (actually- can't you enable
it in OpenSSL directly?  I seem to recall something along those lines
anyway, though it's been quite a few years now).

This really shouldn't be for *just* the server's certificate but rather
available for all certificates involved- on both sides.

Ok, but why? All the other stuff is readily available in the
configuration you use to connect. I guess it doesn't hurt to expose
them through this interface as well, but I can't immediately think
of an example that would use them.

For starters, certificates can be passed between the client and the
server to complete the chain, so I don't see how it's "readily
available", not to mention that even if the location of the certs was in
simple files locally, the application would need to bring in its own
library to parse and extract out this information, which we've

Indeed, the ODBC driver only uses libpq for authentication, then
calls PQgetssl(), and takes over the whole show calling SSL_read()
and SSL_write() itself. Ideally, we'd modify psqlodbc to stop doing
that, but that's not an easy job. In the short-term, I think we need
to export pqsecure_read() and pqsecure_write() functions in libpq,
so that the ODBC driver can use those instead of SSL_read() and
SSL_write().

Yeah, that's what I remembered.  There was an attempt to make that
change at one point, but it was reverted due to the lack of batching
ability in libpq (without resorting to cursors, as I recall...),
requiring double the memory usage.  Still, if pqsecure_read and
pqsecure_write are sufficient to make the ODBC driver work, that's good
news.  I had been worried it did other things with the OpenSSL struct
beyond just using those.

Thanks,

Stephen


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Andres Freund (and...@2ndquadrant.com) wrote:
No. We should build something that's suitable for postgres, not
something general. We'll fail otherwise. For anything fancy the user has
to look at the certificate themselves. We should make it easy to get at
the whole certificate chain in a consistent manner.

I don't buy this argument at all.

Telling users they simply can't have this information isn't
acceptable.

Meh. Why? Most of that isn't something a normal libpq user is going to
need.

I'm not interested in SSL support for users who don't use or care about
SSL (which would be 'normal libpq users', really).  I've *long* been
frustrated by our poor support of SSL and at how painful it is to get
proper SSL working- and it's been a real problem getting PG to pass the
security compliance requirements because of that poor support.  Let's
stop the rhetoric that PG doesn't need anything but the most basic
SSL/auditing/security capabilities.

If we end up not being able to provide everything for all of the
libraries we support then perhaps we can document which are available
from all of them, but I'd hope the list of only in X is pretty small.

I'm pretty sure that we can't build a reasonable list of the information
exposed by every library. Especially as we're likely going to need some
agreed mapping to common names.

Per Apache's documentation, mod_ssl and mod_gnutls support the same set
of environment variables (with the same names even), so I don't buy this
argument either.

Thanks,

Stephen


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On 2014-08-19 11:52:37 -0400, Stephen Frost wrote:
No. We should build something that's suitable for postgres, not
something general. We'll fail otherwise. For anything fancy the user has
to look at the certificate themselves. We should make it easy to get at
the whole certificate chain in a consistent manner.

I don't buy this argument at all.

Aha.

Telling users they simply can't have this information isn't
acceptable.

Meh. Why? Most of that isn't something a normal libpq user is going to
need.

I'm not interested in SSL support for users who don't use or care about
SSL (which would be 'normal libpq users', really).

That's the majority of our users. Even those that care about ssl, and
care about setting it up in a safe manner, won't care about most of the
attributes.

I have no problem to expand the list of attributes once we have a couple
of differing backends for the support, but having a long list of things
that need to be supported by every one just makes getting there harder.

I've *long* been
frustrated by our poor support of SSL and at how painful it is to get
proper SSL working- and it's been a real problem getting PG to pass the
security compliance requirements because of that poor support.  Let's
stop the rhetoric that PG doesn't need anything but the most basic
SSL/auditing/security capabilities.

I've no problem with keeping future extensions of the API in mind while
this is being designed. We just shouldn't start too big. This is about
getting a proper abstraction in place, not making pg pass security
compliance stuff. Don't mix those too much.

If we end up not being able to provide everything for all of the
libraries we support then perhaps we can document which are available
from all of them, but I'd hope the list of only in X is pretty small.

I'm pretty sure that we can't build a reasonable list of the information
exposed by every library. Especially as we're likely going to need some
agreed mapping to common names.

Per Apache's documentation, mod_ssl and mod_gnutls support the same set
of environment variables (with the same names even), so I don't buy this
argument either.

Gnutls is quite similar from what it provides to openssl. That's not
saying much. Schannel would be more interesting from that point of view.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```Stephen Frost wrote:
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:

Indeed, the ODBC driver only uses libpq for authentication, then
calls PQgetssl(), and takes over the whole show calling SSL_read()
and SSL_write() itself. Ideally, we'd modify psqlodbc to stop doing
that, but that's not an easy job. In the short-term, I think we need
to export pqsecure_read() and pqsecure_write() functions in libpq,
so that the ODBC driver can use those instead of SSL_read() and
SSL_write().

Yeah, that's what I remembered.  There was an attempt to make that
change at one point, but it was reverted due to the lack of batching
ability in libpq (without resorting to cursors, as I recall...),
requiring double the memory usage.  Still, if pqsecure_read and
pqsecure_write are sufficient to make the ODBC driver work, that's good
news.  I had been worried it did other things with the OpenSSL struct
beyond just using those.

Um, libpq has recently gained the ability to return result fragments,
right?  Those didn't exist when libpq-ification of odbc was attempted,
as I recall -- perhaps it's possible now.

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Andres Freund (and...@2ndquadrant.com) wrote:
Per Apache's documentation, mod_ssl and mod_gnutls support the same set
of environment variables (with the same names even), so I don't buy this
argument either.

Gnutls is quite similar from what it provides to openssl. That's not
saying much. Schannel would be more interesting from that point of view.

Fine- but let's at least start with what two of the three support and
figure out if there's actually an issue getting this information from
Schannel.  I'd be surprised if there really is, but I'm a lot happier
starting with a larger set and then considering if we can live without
certain things than trying to build up one-by-one over major releases.

Thanks,

Stephen


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
Stephen Frost wrote:
Yeah, that's what I remembered.  There was an attempt to make that
change at one point, but it was reverted due to the lack of batching
ability in libpq (without resorting to cursors, as I recall...),
requiring double the memory usage.  Still, if pqsecure_read and
pqsecure_write are sufficient to make the ODBC driver work, that's good
news.  I had been worried it did other things with the OpenSSL struct
beyond just using those.

Um, libpq has recently gained the ability to return result fragments,
right?  Those didn't exist when libpq-ification of odbc was attempted,
as I recall -- perhaps it's possible now.

I was trying to remember off-hand if we still had that or not..  I
thought there was discussion about removing it, actually, but perhaps
that was something else.

I agree that having that would definitely help with the ODBC driver.

Thanks,

Stephen


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```
On 08/19/2014 06:44 PM, Stephen Frost wrote:

Hmm. That seems a bit too much. Perhaps provide just the certificate
itself in DER/PEM format, and have the client parse it (using
OpenSSL or something else) if it wants more details.

I really don't care for that approach.  Our SSL support has always been
horrible- I was hoping we'd actually improve that situation.  Adding
things in piecemeal over time will just be painful for our users and I
don't see why we should wait.

What would you like to do with the certificates?

I'm imagining that a GUI tool like pgAdmin might want to extract all
information from the certificate, display it in a window, and let the
user look at the whole chain and all the fields. Like a browser does
when you click the little lock icon in the address bar. That would be a
nice feature, but it's a huge effort to expose *all* certificate
information through attributes, especially if you want to support
multiple SSL libraries. If there was a generic "get attribute X"
interface in OpenSSL and all the other SSL libraries we wish to support,
we could provide a pass-through mechanism for that, so that e.g. all
attributes that OpenSSL exposes were mapped to server_cert_*. But I
don't think that exists in OpenSSL, let alone in other libraries, and
the attribute names would be all different anyway.

So that's not really feasible.

But if we provide an interface to grab the whole certificate chain, then
you can use any library you want to parse and present it to the user.
You could use OpenSSL, but you could also use a more light-weight parser
like libtasn1, or if you're writing a python app for example, whatever
x509 certificate handling library they have. You wouldn't be *verifying*
the certificates - that's handled by libpq (or rather, the SSL library
that libpq uses) - so no cryptography required.

Or you could just pass the whole cert to a 3rd party program
specifically written to display x509 certificates, and let it do the
parsing. I'll mention that the Windows Crypto API has a built-in
function called CryptUIDlgViewCertificate that pops up a dialog for
viewing the certificate. Very handy. I think it's the same dialog that
Internet Explorer uses.

If you want to write such a GUI from scratch, anyway, I think you would
be better off to *not* rely on libpq functions, so that you could use
the same GUI in other contexts too. Like to view an arbitrary
certificate file on the filesystem.

That said, if there's a need to extract some specific fields for some
other purpose than displaying the whole certificate to the user, let's
hear it.

- Heikki


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```
On 08/19/2014 07:10 PM, Alvaro Herrera wrote:

Stephen Frost wrote:

* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:

Indeed, the ODBC driver only uses libpq for authentication, then
calls PQgetssl(), and takes over the whole show calling SSL_read()
and SSL_write() itself. Ideally, we'd modify psqlodbc to stop doing
that, but that's not an easy job. In the short-term, I think we need
to export pqsecure_read() and pqsecure_write() functions in libpq,
so that the ODBC driver can use those instead of SSL_read() and
SSL_write().

Yeah, that's what I remembered.  There was an attempt to make that
change at one point, but it was reverted due to the lack of batching
ability in libpq (without resorting to cursors, as I recall...),
requiring double the memory usage.  Still, if pqsecure_read and
pqsecure_write are sufficient to make the ODBC driver work, that's good
news.  I had been worried it did other things with the OpenSSL struct
beyond just using those.

Um, libpq has recently gained the ability to return result fragments,
right?  Those didn't exist when libpq-ification of odbc was attempted,
as I recall -- perhaps it's possible now.

IIRC the thing that psqlodbc does that libpq doesn't support is sending
multiple queries to the backend, and then wait for *all* the replies to
arrive, in a single round-trip. The closest thing is using
PQexec("foo; bar;"), but that's quite limited.

- Heikki


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```Stephen Frost sfr...@snowman.net writes:
Um, libpq has recently gained the ability to return result fragments,
right?  Those didn't exist when libpq-ification of odbc was attempted,
as I recall -- perhaps it's possible now.

I was trying to remember off-hand if we still had that or not..  I
thought there was discussion about removing it, actually, but perhaps
that was something else.

Sure,
http://www.postgresql.org/docs/devel/static/libpq-single-row-mode.html
That's a done deal, it won't be going away.

Whether it would solve ODBC's problem I don't know (and I'm not
volunteering to do the work ;-))

regards, tom lane


```
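
As a rough illustration of the single-row mode Tom links to, a libpq client might use it like this (a hedged sketch: the connection string and query are placeholders, error handling is abbreviated, and it assumes a reachable server):

```c
/* Sketch of libpq single-row mode (PQsetSingleRowMode). */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");  /* placeholder DSN */

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Dispatch the query asynchronously, then switch to single-row
     * mode before fetching, so each row arrives as its own PGresult
     * instead of the whole result set being buffered in memory. */
    PQsendQuery(conn, "SELECT generate_series(1, 1000000)");
    PQsetSingleRowMode(conn);

    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
        {
            /* process one row, e.g. PQgetvalue(res, 0, 0) */
        }
        else if (PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}
```

This addresses the memory-doubling concern mentioned earlier in the thread for large result sets, though it does not by itself provide multi-statement batching.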

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```
On 08/19/2014 06:52 PM, Stephen Frost wrote:

No. We should build something that's suitable for postgres, not
something general. We'll fail otherwise. For anything fancy the user has
to look at the certificate themselves. We should make it easy to get at
the whole certificate chain in a consistent manner.

I don't buy this argument at all.

Telling users they simply can't have this information isn't
acceptable.

Meh. Why? Most of that isn't something a normal libpq user is going to
need.

I'm not interested in SSL support for users who don't use or care about
SSL (which would be 'normal libpq users', really).  I've *long* been
frustrated by our poor support of SSL and at how painful it is to get
proper SSL working- and it's been a real problem getting PG to pass the
security compliance requirements because of that poor support.  Let's
stop the rhetoric that PG doesn't need anything but the most basic
SSL/auditing/security capabilities.

I think you just packed up the goalposts for a one-way trip to Mars, but
I wonder: What would you consider proper SSL support? What exactly are
we missing?

- Heikki


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```
On 08/19/2014 06:00 PM, Magnus Hagander wrote:

On Tue, Aug 19, 2014 at 4:48 PM, Stephen Frost sfr...@snowman.net wrote:

* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:

server_cert_valid: Did the server present a valid certificate?
yes or no

server_cert_matches_host: Does the Common Name of the certificate
match the host connected to? yes or no

Aren't these questions addressed by sslmode?

Not entirely. You can have sslmode=require and have a matching
certificate. You don't *have* to have sslmode=verify-full for that.

However, whether it makes *sense* without sslmode is another story -
but assuming you use something like kerberos for auth, it might. For

Hmm, right, because the client application doesn't get control between
libpq doing the SSL negotiation and sending the password to the server.
So if after connecting you decided that you don't actually trust the
server, it's too late: the password has already been sent.

You might think that you could try connecting without password first,
and try again with the password, but that's not safe either, because
there's no guarantee that the second connection reaches the same server
as the first one.

I think we need a callback or new asynchronous polling state after SSL
negotiation but before libpq sends the password to the server. But
that's a separate feature and patch.

- Heikki


```

### Re: [HACKERS] replication commands and log_statements

```On Sat, Aug 16, 2014 at 10:27 AM, Amit Kapila amit.kapil...@gmail.com wrote:
I think ideally it would have been better if we could have logged
replication commands under separate log_level, but as still there
is no consensus on extending log_statement and nobody is even
willing to pursue, it seems okay to go ahead and log these under
'all' level.

I think the consensus is clearly for a separate GUC.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

```On Sat, Aug 16, 2014 at 9:31 AM, Tomas Vondra t...@fuzzy.cz wrote:
On 12.8.2014 00:30, Tomas Vondra wrote:
On 11.8.2014 20:25, Robert Haas wrote:
It also strikes me that when there's only 1 batch, the set of bits
that map onto the batch number is zero-width, and one zero-width bit
range is as good as another.  In other words, if you're only planning
to do one batch, you can easily grow the number of buckets on the fly.
Growing the number of buckets only becomes difficult once you have
more than one batch.

...

I was considering reversing the bits of the hash, because that's
pretty much the simplest solution. But I think you're right it might
actually work like this:

* Are more batches needed?

(yes) => just use nbuckets = my_log2(work_mem / tuple_size)

(no) => go ahead, until processing all tuples or hitting work_mem

(work_mem) => meh, use the same nbuckets above

(all tuples) => compute optimal nbuckets / resize

But I need to think about this a bit. So far it seems to me there's no
way additional batches might benefit from increasing nbuckets further.

I think this is a simple and solid solution, solving the batchno
computation issues quite nicely. Attached is v10 patch (bare and
combined with the dense allocation), that does this:

1) when we know we'll need batching, buckets are sized for full work_mem
(using the estimated tuple width, etc.)

2) without the batching, we estimate the 'right number of buckets' for
the estimated number of tuples, and keep track of the optimal number
as tuples are added to the hash table

- if we discover we need to start batching, we keep the current
optimal value (which should be the same as the max number of
buckets) and don't mess with it anymore (making it possible to
compute batch IDs just like before)

- also, on the first rebatch (nbatch=1 => nbatch=2) the hash table
is resized as part of the rebatch

- if the hash build completes without batching, we do the resize

I believe the patch is pretty much perfect. I plan to do more thorough
testing on a wide range of queries in the next few days.

I also removed the 'enable_hash_resize' GUC, because it would be more
complex to implement this properly after doing the resize as part of
rebatch etc.. So either it would make the patch more complex, or it
wouldn't do what the name promises.

A variety of trivial comments on this:

PostgreSQL style is un-cuddled curly braces.  Also, a multi-line
comment should end with a line containing only */.  In one place
you've added curly braces around a single-line block that is otherwise
unmodified; please don't do that.  In one place, you have "becase"
instead of "because".  In another place, you write "add if after it"
but it should say "add it after it" or maybe better "add the new one
after it".  Avoid using punctuation like "=>" in comments to
illustrate the connection between sentences; instead, use a connecting
word like "then" or "therefore" or whatever is appropriate; in this
instance, a period followed by the start of a new sentence seems
sufficient.  Revert the removal of a single line of whitespace near
the top of nodeHash.c.

There are too many things marked XXX in this patch.  They should
either be fixed, if they are real problems, or they should be
commented in a way that doesn't give rise to the idea that they're
problems if they aren't.

OK, now on to some more substantive stuff:

1. It's not clear to me what the overall effect of this patch on
memory utilization is.  Reducing NTUP_PER_BUCKET from 10 to 1 is going
to use, on the average, 10x as much bucket-header memory per tuple.
Specifically, I think it means we'll use about 8 bytes of
bucket-header memory per tuple instead of 0.8 bytes per tuple.  If the
tuples are narrow, that could be significant; concerns have been
expressed about that here in the past.  Increasing the number of
buckets could also increase memory usage.  On the other hand, the
dense allocation stuff probably saves a ton of memory, so maybe we end
up ahead overall, but I'm not sure.  Your thoughts, and maybe some test
results with narrow and wide tuples, would be appreciated.

2. But, on the positive side, modulo the memory utilization questions
mentioned above, I would expect the impact on hash join performance to
be positive.  Going from 10 tuples per bucket to just 1 should help,
and on cases where the actual load factor would have ended up much
higher because of poor estimation, increasing the number of buckets on
the fly should help even more.  I haven't tested this, though.

I haven't had a chance to completely go through this yet, so these are
just some initial thoughts.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

```
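
Robert's memory arithmetic above can be checked back-of-envelope. Each hash bucket header is a single pointer (8 bytes on a 64-bit build), and nbuckets is roughly ntuples / NTUP_PER_BUCKET, so the change from 10 to 1 multiplies per-tuple header memory by ten. A sketch under those assumptions; the real sizing in nodeHash.c also rounds nbuckets to a power of two:

```c
/* Bucket-header memory per tuple for a given NTUP_PER_BUCKET.
 * Assumes one 8-byte pointer per bucket and ignores the rounding of
 * nbuckets to a power of two. */
#define POINTER_SIZE 8      /* bucket header: one pointer, 64-bit build */

double header_bytes_per_tuple(long ntuples, int ntup_per_bucket)
{
    long nbuckets = ntuples / ntup_per_bucket;
    return (double) (nbuckets * POINTER_SIZE) / (double) ntuples;
}
```

With a million tuples this gives 0.8 bytes/tuple at NTUP_PER_BUCKET = 10 and 8.0 bytes/tuple at NTUP_PER_BUCKET = 1, matching the figures quoted above.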

### Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

```So, it seems like you're basically using malloc to work around the
fact that a palloc failure is an error, and we can't throw an error in
a critical section.  I don't think that's good; we want all of our
allocations, as far as possible, to be tracked via palloc.  It might
be a good idea to add a new variant of palloc or MemoryContextAlloc
that returns NULL on failure instead of throwing an error; I've wanted
that once or twice.  But in this particular case, I'm not quite seeing
why it should be necessary

I am using malloc so that it returns NULL on failure, and we proceed
without compression of the FPW if it returns NULL.
Proceeding without compression seems more appropriate than throwing an
error and exiting because of a failure to allocate memory for compression.

the number of backup blocks per record is
limited to some pretty small number, so it ought to be possible to
preallocate enough memory to compress them all, perhaps just by
declaring a global variable like char wal_compression_space[8192]; or
whatever.

In the updated patch, a static global variable is added, to which memory is
allocated from the heap using malloc outside the critical section. The size
of the memory block is 4 * BkpBlock header + 4 * BLCKSZ.

Thank you,

On Mon, Aug 18, 2014 at 10:40 PM, Robert Haas robertmh...@gmail.com wrote:

On Thu, Jul 3, 2014 at 3:58 PM, Rahila Syed rahilasye...@gmail.com
wrote:
Updated version of patches are attached.
Changes are as follows
2. Addition of block_compression field in BkpBlock structure to store
information about compression of block. This provides for switching
compression on/off and changing compression algorithm as required.
3.Handling of OOM in critical section by checking for return value of
malloc
and proceeding without compression of FPW if return value is NULL.

So, it seems like you're basically using malloc to work around the
fact that a palloc failure is an error, and we can't throw an error in
a critical section.  I don't think that's good; we want all of our
allocations, as far as possible, to be tracked via palloc.  It might
be a good idea to add a new variant of palloc or MemoryContextAlloc
that returns NULL on failure instead of throwing an error; I've wanted
that once or twice.  But in this particular case, I'm not quite seeing
why it should be necessary - the number of backup blocks per record is
limited to some pretty small number, so it ought to be possible to
preallocate enough memory to compress them all, perhaps just by
declaring a global variable like char wal_compression_space[8192]; or
whatever.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

```
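
Robert's preallocation suggestion can be sketched as follows. This is illustrative only: the buffer size, the header size, and the fallback-to-uncompressed convention are stand-ins for what the actual patch does, not its real code:

```c
/* Scratch space for compressing backup blocks, sized for the worst
 * case of 4 blocks, each with a header plus a full page.  Being
 * static, it exists before any critical section is entered, so no
 * allocation can fail while WAL insertion is in progress. */
#include <string.h>

#define BLCKSZ             8192
#define XLR_MAX_BKP_BLOCKS    4
#define BKP_HEADER_SIZE      32   /* stand-in for sizeof(BkpBlock) */

static char wal_compression_space[XLR_MAX_BKP_BLOCKS *
                                  (BKP_HEADER_SIZE + BLCKSZ)];

/* Stand-in compressor: returns the compressed length, or -1 when the
 * result would not fit, in which case the caller writes the page
 * uncompressed, mirroring the "proceed without compression" fallback
 * discussed above.  Real code would call pglz, LZ4, or snappy here. */
int compress_block(const char *src, int srclen, char *dst, int dstlen)
{
    if (srclen > dstlen)
        return -1;            /* does not fit: fall back */
    memcpy(dst, src, srclen); /* placeholder for the actual compressor */
    return srclen;
}
```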

### Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

```On 19.8.2014 19:05, Robert Haas wrote:
On Sat, Aug 16, 2014 at 9:31 AM, Tomas Vondra t...@fuzzy.cz wrote:
On 12.8.2014 00:30, Tomas Vondra wrote:
On 11.8.2014 20:25, Robert Haas wrote:
It also strikes me that when there's only 1 batch, the set of bits
that map onto the batch number is zero-width, and one zero-width bit
range is as good as another.  In other words, if you're only planning
to do one batch, you can easily grow the number of buckets on the fly.
Growing the number of buckets only becomes difficult once you have
more than one batch.

...

I was considering reversing the bits of the hash, because that's
pretty much the simplest solution. But I think you're right it might
actually work like this:

* Are more batches needed?

(yes) => just use nbuckets = my_log2(work_mem / tuple_size)

(no) => go ahead, until processing all tuples or hitting work_mem

(work_mem) => meh, use the same nbuckets above

(all tuples) => compute optimal nbuckets / resize

But I need to think about this a bit. So far it seems to me there's no
way additional batches might benefit from increasing nbuckets further.

I think this is a simple and solid solution, solving the batchno
computation issues quite nicely. Attached is v10 patch (bare and
combined with the dense allocation), that does this:

1) when we know we'll need batching, buckets are sized for full work_mem
(using the estimated tuple width, etc.)

2) without the batching, we estimate the 'right number of buckets' for
the estimated number of tuples, and keep track of the optimal number
as tuples are added to the hash table

- if we discover we need to start batching, we keep the current
optimal value (which should be the same as the max number of
buckets) and don't mess with it anymore (making it possible to
compute batch IDs just like before)

- also, on the first rebatch (nbatch=1 => nbatch=2) the hash table
is resized as part of the rebatch

- if the hash build completes without batching, we do the resize

I believe the patch is pretty much perfect. I plan to do more thorough
testing on a wide range of queries in the next few days.

I also removed the 'enable_hash_resize' GUC, because it would be more
complex to implement this properly after doing the resize as part of
rebatch etc.. So either it would make the patch more complex, or it
wouldn't do what the name promises.

A variety of trivial comments on this:

PostgreSQL style is un-cuddled curly braces.  Also, a multi-line
comment should end with a line containing only */.  In one place
you've added curly braces around a single-line block that is otherwise
unmodified; please don't do that.  In one place, you have "becase"
instead of "because".  In another place, you write "add if after it"
but it should say "add it after it" or maybe better "add the new one
after it".  Avoid using punctuation like "=>" in comments to
illustrate the connection between sentences; instead, use a connecting
word like "then" or "therefore" or whatever is appropriate; in this
instance, a period followed by the start of a new sentence seems
sufficient.  Revert the removal of a single line of whitespace near
the top of nodeHash.c.

There are too many things marked XXX in this patch.  They should
either be fixed, if they are real problems, or they should be
commented in a way that doesn't give rise to the idea that they're
problems if they aren't.

OK, thanks for pointing this out. Attached is v11 of the patch (both
separate and combined with the dense allocation, as before).

I fixed as many of those issues as possible. All the XXX items were
obsolete, except for one in the chunk_alloc function.

I have also removed one constant

OK, now on to some more substantive stuff:

1. It's not clear to me what the overall effect of this patch on
memory utilization is.  Reducing NTUP_PER_BUCKET from 10 to 1 is going
to use, on the average, 10x as much bucket-header memory per tuple.
Specifically, I think it means we'll use about 8 bytes of
bucket-header memory per tuple instead of 0.8 bytes per tuple.  If the
tuples are narrow, that could be significant; concerns have been
expressed about that here in the past.  Increasing the number of
buckets could also increase memory usage.  On the other hand, the
dense allocation stuff probably saves a ton of memory, so maybe we end
up ahead overall, but I'm not sure.  Your thoughts, and maybe some test
results with narrow and wide tuples, would be appreciated.

The effect of the dense allocation was briefly discussed in this thread,
along with some quick measurements:

http://www.postgresql.org/message-id/53beea9e.2080...@fuzzy.cz

The dense allocation removes pretty much all the palloc overhead. For a
40B tuple, I did get this before the dense allocation

HashBatchContext: 1451221040 total in 182 blocks; 2826592 free (11
chunks);
```
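
For context, the dense allocation being measured can be sketched like this (a simplified stand-in: the real chunk_alloc in the patch carves chunks out of the HashBatchContext with palloc rather than malloc, and its chunk size differs):

```c
/* Dense allocation: carve tuples out of large shared chunks instead
 * of doing one palloc per tuple, so the per-allocation chunk-header
 * overhead is paid once per 32kB rather than once per tuple. */
#include <stdlib.h>
#include <stddef.h>

#define CHUNK_SIZE (32 * 1024)  /* illustrative chunk size */

typedef struct HashChunk
{
    struct HashChunk *next;     /* chain of chunks, freed all at once */
    size_t used;                /* bytes handed out from this chunk */
    char data[CHUNK_SIZE];
} HashChunk;

static HashChunk *chunks = NULL;

/* Hand out 'size' bytes from the current chunk, starting a fresh
 * chunk when the current one cannot satisfy the request. */
void *dense_alloc(size_t size)
{
    size = (size + 7) & ~(size_t) 7;    /* keep 8-byte alignment */

    if (chunks == NULL || chunks->used + size > CHUNK_SIZE)
    {
        HashChunk *c = malloc(sizeof(HashChunk));

        if (c == NULL)
            return NULL;
        c->next = chunks;
        c->used = 0;
        chunks = c;
    }

    void *p = chunks->data + chunks->used;
    chunks->used += size;
    return p;
}
```

Freeing is equally cheap: walk the chunk list and release each chunk, with no need to track individual tuples.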

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Tom Lane (t...@sss.pgh.pa.us) wrote:
Stephen Frost sfr...@snowman.net writes:
Um, libpq has recently gained the ability to return result fragments,
right?  Those didn't exist when libpq-ification of odbc was attempted,
as I recall -- perhaps it's possible now.

I was trying to remember off-hand if we still had that or not..  I
thought there was discussion about removing it, actually, but perhaps
that was something else.

Sure,
http://www.postgresql.org/docs/devel/static/libpq-single-row-mode.html
That's a done deal, it won't be going away.

Ugh.  Yes, there's single-row mode, but I had been thinking there was a
'batch' mode available ala what OCI8 had, where you'd allocate a chunk
of memory and then have it filled directly by the library as rows came
back in until it was full (there was a similar 'bulk send' operation, as
I recall).  Perhaps it was the 'pipelining' thread that I was thinking
about.  Not really relevant, in any case.

Whether it would solve ODBC's problem I don't know (and I'm not
volunteering to do the work ;-))

It could work..  though it's certainly been a while since I looked at
the ODBC internals.

Thanks,

Stephen


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On Mon, Aug 18, 2014 at 7:54 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
In order to support alternatives to OpenSSL, we need to wean off
applications from using PQgetssl(). To do that, we have to provide an
alternative API to get the same information. PQgetSSL() returns a pointer
directly to the OpenSSL private struct, and you can do anything with that.
We cannot have a generic interface that exposes everything, so we need to
identify the information that people actually want, and expose that.

I have a hard time believing that something like this will really
satisfy anyone.  Why not just add PQgetSchannelHandleOrWhatever() and
call it good?  We can try to be incredibly thorough in exposing the
information people want and we will still inevitably miss something
that someone cares about; worse, we'll spend an awful lot of time and
energy along the way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On Tue, Aug 19, 2014 at 8:49 PM, Robert Haas robertmh...@gmail.com wrote:
On Mon, Aug 18, 2014 at 7:54 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
In order to support alternatives to OpenSSL, we need to wean off
applications from using PQgetssl(). To do that, we have to provide an
alternative API to get the same information. PQgetSSL() returns a pointer
directly to the OpenSSL private struct, and you can do anything with that.
We cannot have a generic interface that exposes everything, so we need to
identify the information that people actually want, and expose that.

I have a hard time believing that something like this will really
satisfy anyone.  Why not just add PQgetSchannelHandleOrWhatever() and
call it good?  We can try to be incredibly thorough in exposing the
information people want and we will still inevitably miss something
that someone cares about; worse, we'll spend an awful lot of time and
energy along the way.

Well, for one you push the full burden onto the application. Then
every application has to support every SSL library we do, even for the
simplest check. And it has to be built against the same one. (So for
example if someone wants to use openssl on windows - yes there might
still be reasons for that even if we support schannel - they have to
rebuild every one of their applications. And every one of their higher
level language drivers sitting on top of openssl).

The same problem of course appears on say Linux, if you end up using a
mix of openssl and gnutls or a mix of nss and openssl for example.
It's not likely to happen as long as you only use the officially built
packages, but you're likely in for quite a bit of pain if you are
using any non-standard packaging like the one-click installers etc.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```Magnus Hagander mag...@hagander.net writes:
On Tue, Aug 19, 2014 at 8:49 PM, Robert Haas robertmh...@gmail.com wrote:
I have a hard time believing that something like this will really
satisfy anyone.  Why not just add PQgetSchannelHandleOrWhatever() and
call it good?  We can try to be incredibly thorough in exposing the
information people want and we will still inevitably miss something
that someone cares about; worse, we'll spend an awful lot of time and
energy along the way.

Well, for one you push the full burden onto the application.

Robert's got a point though: there is always going to be somebody who
wants something we fail to expose.  It's better to be able to say well,
you can do PQgetssl and then munge it for yourself than to have to say
sorry, you're screwed.  So if we're going to define PQgetssl as
returning NULL when you're not using OpenSSL, I don't see why we
shouldn't expose a similarly-defined PQgetXXX for each other underlying
implementation we support.  There will not be that many of 'em, and
I suspect the people with very specific needs will not care about more
than one underlying library anyway.

This does not say that we shouldn't also try to have some
library-independent functionality for interrogating certificate state
etc.  Just that having an escape hatch isn't a bad thing.

regards, tom lane


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On Tue, Aug 19, 2014 at 9:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Magnus Hagander mag...@hagander.net writes:
On Tue, Aug 19, 2014 at 8:49 PM, Robert Haas robertmh...@gmail.com wrote:
I have a hard time believing that something like this will really
satisfy anyone.  Why not just add PQgetSchannelHandleOrWhatever() and
call it good?  We can try to be incredibly thorough in exposing the
information people want and we will still inevitably miss something
that someone cares about; worse, we'll spend an awful lot of time and
energy along the way.

Well, for one you push the full burden onto the application.

Robert's got a point though: there is always going to be somebody who
wants something we fail to expose.  It's better to be able to say well,
you can do PQgetssl and then munge it for yourself than to have to say
sorry, you're screwed.  So if we're going to define PQgetssl as
returning NULL when you're not using OpenSSL, I don't see why we
shouldn't expose a similarly-defined PQgetXXX for each other underlying
implementation we support.  There will not be that many of 'em, and
I suspect the people with very specific needs will not care about more
than one underlying library anyway.

This does not say that we shouldn't also try to have some
library-independent functionality for interrogating certificate state
etc.  Just that having an escape hatch isn't a bad thing.

I do agree that having both would be useful. We could have something like
int PQgetSSLstruct(void **sslstruct)

which returns the type of struct. Then it's up to the application to
know if it can handle it. For those apps that need a *lot*. But the
basic attributes - something like the list from apache - should be
retrievable in a library independent way.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


```
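
For illustration only, the escape hatch Magnus floats might look roughly like this. Every name below (the enum, the function, the stub body) is invented for the sketch; no such API exists in libpq:

```c
/* Hypothetical escape-hatch interface: return the library-specific
 * connection object together with a code identifying which library
 * it belongs to, so the application knows whether it can use it. */
#include <stddef.h>

typedef enum
{
    PQSSL_NONE = 0,             /* connection is not encrypted */
    PQSSL_OPENSSL,              /* *sslstruct is an OpenSSL SSL * */
    PQSSL_GNUTLS,
    PQSSL_SCHANNEL
} PQsslLibrary;

/* Stub standing in for the proposed libpq call: a real version would
 * read the handle and library type out of the PGconn. */
PQsslLibrary PQgetSSLstruct_sketch(void *conn_ssl_handle,
                                   PQsslLibrary lib,
                                   void **sslstruct)
{
    *sslstruct = conn_ssl_handle;
    return lib;
}
```

An application would switch on the return value and only dereference the pointer for a library it was built against; the library-independent attributes discussed above cover everything else.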

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
On 08/19/2014 06:44 PM, Stephen Frost wrote:
Hmm. That seems a bit too much. Perhaps provide just the certificate
itself in DER/PEM format, and have the client parse it (using
OpenSSL or something else) if it wants more details.
I really don't care for that approach.  Our SSL support has always been
horrible- I was hoping we'd actually improve that situation.  Adding
things in piecemeal over time will just be painful for our users and I
don't see why we should wait.

What would you like to do with the certificates?

In applications which I've developed in the past, I've had to rely on
the CN, serial number, and signing root CA to ensure that there were no
duplicates (this was a government environment which trusted multiple
independent root CAs, and there was no guarantee that even a given CA
wouldn't issue the same serial number to different individuals).  In
other cases, I've had to rely on the fingerprint, but that gets painful
when you have certificate roll-over since you then have to re-enroll
individuals when they get issued a new certificate.  I've also
implemented systems which have certificate expiration warnings.
Checking the extended attributes of the certificate has been a
requirement in the past (to verify it's only being used for its intended
purpose).

One of the things we don't support today is anything beyond matching on
the CN of the certificate in pg_ident, to map from a client certificate
to a PG role.  That wouldn't be acceptable in environments I've worked
in because two different individuals could have identical CNs.  Another
interesting twist are systems (such as Windows..) where the client
certificate to be presented depends on which root CA the server's
certificate is signed with.

I'm not asking this patch to fix that, but you asked what else a
developer might be looking for when it comes to SSL and I'm telling you
things I've actively used.  Generally speaking, these have been on the
server side (eg: with mod_ssl), but I could see a client wanting to use
them, and if we abstract getting this information on the server side to
meet the needs I've described above, wouldn't we be able to (and want
to) share that abstraction with users of libpq?

I'm imagining that a GUI tool like pgAdmin might want to extract all
information from the certificate, display it in a window, and let
the user look at the whole chain and all the fields.

While that'd certainly be nice, it's not what I'm referring to and I
agree that having a third party library to handle that makes sense, as
some operating systems do.  In general, I'm all for more (and better)
integration with the OS-provided certificate systems.  For one thing,
they also can address the issues around ensuring that the client side
certificate is encrypted-at-rest, and can handle prompting the user for
the passphrase to decrypt it.

But I don't think that exists in OpenSSL, let alone
in other libraries, and the attribute names would be all different
anyway.

As I said- let's look at mod_ssl/gnutls as a minimum set to start with..
That's certainly a set I'm familiar with and one which I expect most
other developers who work with SSL are also.  There are bits missing
from that list (mainly around the extended attributes..), but it's
certainly better than the list originally proposed.

But if we provide an interface to grab the whole certificate chain,
then you can use any library you want to parse and present it to the
user.

Yes- we should do this also because there may be cases where the app
developers wants to pass that off to another library or do something
else with it, sure.

Thanks,

Stephen

signature.asc
Description: Digital signature

```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
I think you just packed up the goalposts for a one-way trip to Mars,
but I wonder: What would you consider proper SSL support? What
exactly are we missing?

I hit on a few things in my other email, but there is a huge portion of
SSL which is just about making it easy and sensible to install and get
working properly.  Apache is a good example of how to do this and is one
that a lot of people are familiar with.  Specific issues that I recall
running into are lack of the 'directory' options for certificates,
having trouble figuring out the right format and structure to provide
the complete root chain for the server's certificate and then trying to
figure out how to add intermediate and additional root CAs for client
certificates, getting CRLs to work was a pain, and nothing about how to
get OCSP working.

I think there's been some improvement since I last had to go through the
pain of setting this all up, and some of it is undoubtedly OpenSSL's
fault, but there's definitely quite a bit more we could be doing to make
SSL support easier.  I'm hopeful that I'll be able to spend more time on
this in the future but it's not a priority currently.

Thanks,

Stephen


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On Tue, Aug 19, 2014 at 3:16 PM, Magnus Hagander mag...@hagander.net wrote:
On Tue, Aug 19, 2014 at 9:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Magnus Hagander mag...@hagander.net writes:
On Tue, Aug 19, 2014 at 8:49 PM, Robert Haas robertmh...@gmail.com wrote:
I have a hard time believing that something like this will really
satisfy anyone.  Why not just add PQgetSchannelHandleOrWhatever() and
call it good?  We can try to be incredibly thorough in exposing the
information people want and we will still inevitably miss something
that someone cares about; worse, we'll spend an awful lot of time and
energy along the way.

Well, for one you push the full burden onto the application.

Robert's got a point though: there is always going to be somebody who
wants something we fail to expose.  It's better to be able to say well,
you can do PQgetssl and then munge it for yourself than to have to say
sorry, you're screwed.  So if we're going to define PQgetssl as
returning NULL when you're not using OpenSSL, I don't see why we
shouldn't expose a similarly-defined PQgetXXX for each other underlying
implementation we support.  There will not be that many of 'em, and
I suspect the people with very specific needs will not care about more
than one underlying library anyway.

This does not say that we shouldn't also try to have some
library-independent functionality for interrogating certificate state
etc.  Just that having an escape hatch isn't a bad thing.

I do agree that having both would be useful. We could have something like
int PQgetSSLstruct(void **sslstruct)

I think it's likely smarter to have totally separate functions.
First, to make it less likely that users will try to use a pointer to
one type of object as a pointer to some other kind of object.  And
second, because you might, for example, someday have an SSL
implementation that wants to return two pointers.  May as well make
that kind of thing easy.

BTW, if we're beating on libpq, I wonder if we shouldn't consider
bumping the soversion at some point.  I mean, I know that we
technically don't need to do that if we're only *adding* functions and
not changing any of the existing stuff in backward-incompatible ways,
but we might *want* to make some backward-incompatible changes at some
point, and I think there's a decent argument that any patch in this
area is already doing that at least to PQgetSSL().  Maybe this would be
a good time to think if there's anything else we want to do that
would, either by itself or in combination, justify a bump.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```

### Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

```On 8/19/14 9:35 AM, MauMau wrote:
pg_copy is for copying a file reliably by syncing.  If sync is not
necessary, people can use cp/copy.

I'm getting mixed messages from this thread.

I think there could be a fair amount of support for a new tool that can
serve as a universal plug-and-play archive_command, with a variety of
options, such as fsync yes/no, fadvise yes/no, direct-io[*] yes/no,
atomic copy yes/no, allow overwrite yes/no, compression yes/no.  That
would reduce the need for users to compose adventurous shell commands,
and it would set out the various options more clearly.

This is not that.  This is cp+fsync with a hardcoded fadvise policy and
optional direct-io.  That is a valid problem to address, but it is one
of many.  On the other hand, I fear that the addition of this
single-and-a-half-purpose tool would make the overall landscape more
complicated than it already is.  Since it's in the examples, people will
probably use it, even if they don't need to or shouldn't.  And not
recommending it for the restore_command is also confusing.

Another example of how confusing all of this is: On Windows, the copy
command by default doesn't overwrite files, which is what we want
(usually).  The patch changes those instances of copy to pg_copy, but it
doesn't have that behavior.  Should the examples be changed to do a test
pg_copy on Windows (what's the Windows shell syntax for that?), or
should pg_copy have an option to not overwrite a file?  How do you then
avoid inconsistencies with the Unix behavior?  Or what if you want fsync
but allow overwriting on Windows?

On the technical side, I think if you fsync a new file, you also need to
fsync the directory, to make sure the file is certain to be visible
after a crash.

[*] I keep reading directio as a typo of direction, so please
consider putting a hyphen or underscore in the option and variable
names. ;-)


```

### Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

```On 8/15/14 10:46 AM, Fujii Masao wrote:
At last, the big question is, is there really no OS command which provides
the same functionality as pg_copy does? If there is, I'd like to avoid
duplicate
work basically.

If you look hard enough, you can maybe find an OS command that can fsync
a file after it was copied.  Some versions of dd can do that, and some
systems have an fsync program.  But it's not clear whether all systems
have that, and it probably won't be simple and consistent.


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```* Robert Haas (robertmh...@gmail.com) wrote:
BTW, if we're beating on libpq, I wonder if we shouldn't consider
bumping the soversion at some point.  I mean, I know that we
technically don't need to do that if we're only *adding* functions and
not changing any of the existing stuff in backward-incompatible ways,
but we might *want* to make some backward-incompatible changes at some
point, and I think there's a decent argument that any patch in this
area is already doing that at least to PQgetSSL().  Maybe this would be
a good time to think if there's anything else we want to do that
would, either by itself or in combination, justify a bump.

I'm not a big fan of doing it for this specific item, though it's
technically an API breakage (which means we should actually have
libpq2-dev packages, make everything that build-deps on libpq-dev
update to build-dep on libpq2-dev, have libpq6, etc..).  If there are
other backwards-incompatible things we wish to do, then I agree that
it'd be good to do them all at the same time (perhaps in conjunction
with 10.0...).  This is the part where I wish we had been keeping an
updated list of things we want to change (like on the wiki..).

It's certainly not a fun transition to go through.  I also wonder if
we're going to need to worry about what happens when libpq5 and libpq6
end up linked into the same running application.  I don't think we
have any symbol versioning or anything to address that risk in place..

Thanks,

Stephen


```

### Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

```On 08/19/2014 12:37 PM, Peter Eisentraut wrote:
On 8/19/14 9:35 AM, MauMau wrote:
pg_copy is for copying a file reliably by syncing.  If sync is not
necessary, people can use cp/copy.

I'm getting mixed messages from this thread.

I think there could be a fair amount of support for a new tool that can
serve as a universal plug-and-play archive_command, with a variety of
options, such as fsync yes/no, fadvise yes/no, direct-io[*] yes/no,
atomic copy yes/no, allow overwrite yes/no, compression yes/no.  That
would reduce the need for users to compose adventurous shell commands,
and it would set out the various options more clearly.

Although I'm not sold on the idea of using DirectIO for this.  Is there
really enough benefit to make it worth the trouble?

This is not that.  This is cp+fsync with a hardcoded fadvise policy and
optional direct-io.  That is a valid problem to address, but it is one
of many.  On the other hand, I fear that the addition of this
single-and-a-half-purpose tool would make the overall landscape more
complicated than it already is.  Since it's in the examples, people will
probably use it, even if they don't need to or shouldn't.  And not
recommending it for the restore_command is also confusing.

I'm afraid that I agree with Peter here.  pg_copy looks like a nice
foundation for the eventual pg_copy utility we need, but it's not there yet.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


```

### Re: [HACKERS] 9.3: more problems with Could not open file pg_multixact/members/xxxx

```On Tue, Jul 15, 2014 at 3:58 PM, Jeff Janes jeff.ja...@gmail.com wrote:

On Fri, Jun 27, 2014 at 11:51 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

Jeff Janes wrote:

This problem was initially fairly easy to reproduce, but since I
started adding instrumentation specifically to catch it, it has become
devilishly hard to reproduce.

I think my next step will be to also log each of the values which goes
into the complex if (...) expression that decides on the deletion.

Could you please try to reproduce it after updating to latest?  I pushed
fixes that should close these issues.  Maybe you want to remove the
instrumentation you added, to make failures more likely.

There are still some problems in 9.4, but I haven't been able to diagnose
them and wanted to do more research on it.  The announcement of upcoming
back-branches for 9.3 spurred me to try it there, and I have problems with
9.3 (12c5bbdcbaa292b2a4b09d298786) as well.  The move of truncation to the
checkpoint seems to have made the problem easier to reproduce.  On an 8
core machine, this test fell over after about 20 minutes, which is much
faster than it usually reproduces.

This is the error I get:

2084 UPDATE 2014-07-15 15:26:20.608 PDT:ERROR:  could not access status of
transaction 85837221
2084 UPDATE 2014-07-15 15:26:20.608 PDT:DETAIL:  Could not open file
pg_multixact/members/14031: No such file or directory.
2084 UPDATE 2014-07-15 15:26:20.608 PDT:CONTEXT:  SQL statement SELECT 1
FROM ONLY public.foo_parent x WHERE id OPERATOR(pg_catalog.=) \$1 FOR
KEY SHARE OF x

The testing harness is attached as 3 patches that must be made to the test
server, and 2 scripts. The script do.sh sets up the database (using fixed
paths, so be careful) and then invokes count.pl in a loop to do the
actual work.

Sorry, after a long time when I couldn't do much testing on this, I've now
been able to get back to it.

It looks like what is happening is that  checkPoint.nextMultiOffset wraps
around from 2^32 to 0, even if 0 is still being used.  At that point it
starts deleting member files that are still needed.

Is there some interlock which is supposed to prevent
checkPoint.nextMultiOffset from lapping itself?  I haven't been able to find
it.  It seems like the interlock applies only to the MultiXactId, not the Offset.

Thanks,

Jeff

```

### Re: [HACKERS] 9.3: more problems with Could not open file pg_multixact/members/xxxx

```On August 19, 2014 10:24:20 PM CEST, Jeff Janes jeff.ja...@gmail.com wrote:
On Tue, Jul 15, 2014 at 3:58 PM, Jeff Janes jeff.ja...@gmail.com
wrote:

On Fri, Jun 27, 2014 at 11:51 AM, Alvaro Herrera
wrote:

Jeff Janes wrote:

This problem was initially fairly easy to reproduce, but since I
started adding instrumentation specifically to catch it, it has
become
devilishly hard to reproduce.

I think my next step will be to also log each of the values which
goes
into the complex if (...) expression that decides on the deletion.

Could you please try to reproduce it after updating to latest?  I pushed
fixes that should close these issues.  Maybe you want to remove the
instrumentation you added, to make failures more likely.

There are still some problems in 9.4, but I haven't been able to
diagnose
them and wanted to do more research on it.  The announcement of
upcoming
back-branches for 9.3 spurred me to try it there, and I have problems
with
9.3 (12c5bbdcbaa292b2a4b09d298786) as well.  The move of truncation
to the
checkpoint seems to have made the problem easier to reproduce.  On an
8
core machine, this test fell over after about 20 minutes, which is
much
faster than it usually reproduces.

This is the error I get:

2084 UPDATE 2014-07-15 15:26:20.608 PDT:ERROR:  could not access
status of
transaction 85837221
2084 UPDATE 2014-07-15 15:26:20.608 PDT:DETAIL:  Could not open file
pg_multixact/members/14031: No such file or directory.
2084 UPDATE 2014-07-15 15:26:20.608 PDT:CONTEXT:  SQL statement
SELECT 1
FROM ONLY public.foo_parent x WHERE id OPERATOR(pg_catalog.=)
\$1 FOR
KEY SHARE OF x

The testing harness is attached as 3 patches that must be made to the
test
server, and 2 scripts. The script do.sh sets up the database (using
fixed
paths, so be careful) and then invokes count.pl in a loop to do the
actual work.

Sorry, after a long time when I couldn't do much testing on this, I've
now
been able to get back to it.

It looks like what is happening is that  checkPoint.nextMultiOffset
wraps
around from 2^32 to 0, even if 0 is still being used.  At that point it
starts deleting member files that are still needed.

Is there some interlock which is supposed to prevent
checkPoint.nextMultiOffset from lapping itself?  I haven't been able to
find it.  It seems like the interlock applies only to the MultiXactId,
not the Offset.

There is none (and there never has been one either). I've complained about it a
couple of times but nobody, me included, had time and energy to fix that :(

Andres

---
Please excuse brevity and formatting - I am writing this on my mobile phone.


```

### Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

```MauMau wrote:

With that said, copying to a temporary file like dest.tmp and
renaming it to dest sounds worthwhile even as a basic copy
utility.  I want to avoid copying to a temporary file with a fixed
name like _copy.tmp, because some advanced utility may want to run
multiple instances of pg_copy to copy several files into the same
directory simultaneously.  However, I'm afraid multiple dest.tmp
files might continue to occupy disk space after canceling copy or
power failure in some use cases, where the copy of the same file
won't be retried.  That's also the reason why I chose to not use a
temporary file like cp/copy.

Is there a way to create a link to a file which only exists as an open
file descriptor?   If there was, you could create a temp file, open an
fd, then delete the file.  That would remove the issue with files being
leaked due to failures of various kinds.

Also, it's been mentioned that this utility might be useful for
restore_command.  That sounds good I guess, but need to keep the
RECOVERYXLOG trick in mind.  I remember a case of stalled replay because
the restore command was writing to RECOVERYXLOG.gz and ungzipping, and
the unlink(RECOVERYXLOG) call failed after a partial copy and so did
the copy from the archive.  (Removing the borked RECOVERYXLOG.gz fixed
it.)

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```
On 08/19/2014 10:31 PM, Robert Haas wrote:

On Tue, Aug 19, 2014 at 3:16 PM, Magnus Hagander mag...@hagander.net wrote:

On Tue, Aug 19, 2014 at 9:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Magnus Hagander mag...@hagander.net writes:

On Tue, Aug 19, 2014 at 8:49 PM, Robert Haas robertmh...@gmail.com wrote:

I have a hard time believing that something like this will really
satisfy anyone.  Why not just add PQgetSchannelHandleOrWhatever() and
call it good?  We can try to be incredibly thorough in exposing the
information people want and we will still inevitably miss something
that someone cares about; worse, we'll spend an awful lot of time and
energy along the way.

Well, for one you push the full burden onto the application.

Robert's got a point though: there is always going to be somebody who
wants something we fail to expose.  It's better to be able to say well,
you can do PQgetssl and then munge it for yourself than to have to say
sorry, you're screwed.  So if we're going to define PQgetssl as
returning NULL when you're not using OpenSSL, I don't see why we
shouldn't expose a similarly-defined PQgetXXX for each other underlying
implementation we support.  There will not be that many of 'em, and
I suspect the people with very specific needs will not care about more
than one underlying library anyway.

This does not say that we shouldn't also try to have some
library-independent functionality for interrogating certificate state
etc.  Just that having an escape hatch isn't a bad thing.

Yeah, wouldn't hurt I guess.

I do agree that having both would be useful. We could have something like
int PQgetSSLstruct(void **sslstruct)

I think it's likely smarter to have totally separate functions.
First, to make it less likely that users will try to use a pointer to
one type of object as a pointer to some other kind of object.  And
second, because you might, for example, someday have an SSL
implementation that wants to return two pointers.  May as well make
that kind of thing easy.

The struct it returns is totally SSL-implementation specific anyway, so
for an implementation that would like to return two structs, you could
well define it to return a struct like:

struct {
CoolStructA *a;
CoolStructB *b;
} CoolSSLStruct;

I don't much like adding a separate function for every SSL
implementation, but you've got a point that it would be nice to make it
difficult to call PQgetSSLstruct() and just assume that the returned
struct is e.g an OpenSSL struct, while it's actually something else.
Perhaps:

int PQgetSSLstruct(void **sslstruct, char *structname)

You'd call it like PQgetSSLStruct(mystruct, openssl), and it checks
that the argument matches the library actually being used; otherwise it
returns an error. And if you need to return two structs, you'd call it
twice: PQgetSSLStruct(a, cool_a) and PQgetSSLStruct(b, cool_b).

- Heikki


```

### Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch

```
On 08/20/2014 12:17 AM, John Lumby wrote:

I am attaching a new version of the patch for consideration in the current
commit fest.

Thanks for working on this!

Relative to the one I submitted on 25 June in
bay175-w412ff89303686022a9f16aa3...@phx.gbl
the method for handling aio completion using sigevent has been re-written to use
signals exclusively rather than a composite of signals and LWlocks,
and this has fixed the problem I mentioned before with the LWlock method.

ISTM the patch is still allocating stuff in shared memory that really
doesn't belong there. Namely, the BufferAiocb structs. Or at least parts
of it; there's also a waiter queue there which probably needs to live in
shared memory, but the rest of it does not.

At least BufAWaitAioCompletion is still calling aio_error() on an AIO
request that might've been initiated by another backend. That's not OK.

Please write the patch without atomic CAS operation. Just use a
spinlock. There's a patch in the commitfest to add support for that, but
it's not committed yet, and all those USE_AIO_ATOMIC_BUILTIN_COMP_SWAP
ifdefs make the patch more difficult to read. Same with all the other
#ifdefs; please just pick a method that works.

Also, please split prefetching of regular index scans into a separate
patch. It's orthogonal to doing async I/O; we could prefetch regular
index scans with posix_fadvise too, and AIO should be useful for
prefetching other stuff.

- Heikki


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On 2014-08-20 00:58:22 +0300, Heikki Linnakangas wrote:
I don't much like adding a separate function for every SSL implementation,
but you've got a point that it would be nice to make it difficult to call
PQgetSSLstruct() and just assume that the returned struct is e.g an OpenSSL
struct, while it's actually something else. Perhaps:

A good reason to not have functions returning the respective libraries'
types is that they would require declarations of those types.

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch

```On Tue, Aug 19, 2014 at 7:27 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
Also, please split prefetching of regular index scans into a separate patch.
It's orthogonal to doing async I/O; we could prefetch regular index scans
with posix_fadvise too, and AIO should be useful for prefetching other
stuff.

That patch already happened on the list, and it wasn't a win in many
cases. I'm not sure it should be proposed independently of this one.
Maybe a separate patch, but it should be considered dependent on this.

I don't have an archive link at hand atm, but I could produce one later.


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```Andres Freund and...@2ndquadrant.com writes:
On 2014-08-20 00:58:22 +0300, Heikki Linnakangas wrote:
I don't much like adding a separate function for every SSL implementation,
but you've got a point that it would be nice to make it difficult to call
PQgetSSLstruct() and just assume that the returned struct is e.g an OpenSSL
struct, while it's actually something else. Perhaps:

A good reason to not have functions returning the respective libraries'
types is that they would require declarations of those types.

It requires no such thing.  What we do for PQgetssl() is declare it as
returning void *, and we could easily do the same for other libraries.

regards, tom lane


```

### Re: [HACKERS] PQgetssl() and alternative SSL implementations

```On 2014-08-19 19:11:46 -0400, Tom Lane wrote:
On 2014-08-20 00:58:22 +0300, Heikki Linnakangas wrote:
I don't much like adding a separate function for every SSL implementation,
but you've got a point that it would be nice to make it difficult to call
PQgetSSLstruct() and just assume that the returned struct is e.g an OpenSSL
struct, while it's actually something else. Perhaps:

A good reason to not have functions returning the respective libraries'
types is that they would require declarations of those types.

It requires no such thing.  What we do for PQgetssl() is declare it as
returning void *, and we could easily do the same for other libraries.

Well, the reason the library-specific variant has been called superior
upthread is the potential for type safety...

Greetings,

Andres Freund

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### [HACKERS] KNN searches support for SP-GiST [GSOC'14]

```Hi there, pg-Hackers!
Here is the patch which adds the ability to perform nearest-neighbour
searches on SP-GiST indexes (as of now implemented for quad and kd trees).
Pre-reviewed by my GSoC mentor Alexander Korotkov.
A sample benchmarking script is included in the attachment (it dumps the
current geonames archive and runs several searches on the (latitude,
longitude) points), demonstrating dramatic improvements over plain
searches and sorting. Regression tests are included; the patch compiles
and runs successfully on both Ubuntu 12.04 Server and 08/2014 Arch Linux.

diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 56827e5..5214770 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -83,6 +83,7 @@
literalgt;gt;/
literalgt;^/
literal~=/
+   literallt;-gt;/
/entry
/row
row
@@ -95,6 +96,7 @@
literalgt;gt;/
literalgt;^/
literal~=/
+   literallt;-gt;/
/entry
/row
row
@@ -137,6 +139,10 @@
supports the same operators but uses a different index data structure which
may offer better performance in some applications.
/para
+ para
+  By supporting the ordering lt;-gt; operator, quad_point_ops and kd_point_ops allow
+  the user to perform a K-nearest-neighbour search over the indexed point dataset.
+ /para

/sect1

@@ -539,9 +545,12 @@ CREATE FUNCTION my_inner_consistent(internal, internal) RETURNS void ...
typedef struct spgInnerConsistentIn
{
ScanKey scankeys;   /* array of operators and comparison values */
+ScanKey		orderbyKeys;	/* array of ordering operators and comparison values */
int nkeys;  /* length of array */
+int norderbys;  /* length of array */

Datum   reconstructedValue; /* value reconstructed at parent */
+Datum		suppValue;		/* supplementary value of parent */
int level;  /* current level (counting from zero) */
boolreturnData; /* original data must be returned? */

@@ -559,6 +568,8 @@ typedef struct spgInnerConsistentOut
int*nodeNumbers;/* their indexes in the node array */
int*levelAdds;  /* increment level by this much for each */
Datum  *reconstructedValues;/* associated reconstructed values */
+Datum	   *suppValues;		/* any additional data the implementation needs stored in the child nodes */
+double	   **distances;		/* associated distances */
} spgInnerConsistentOut;
/programlisting

@@ -573,10 +584,15 @@ typedef struct spgInnerConsistentOut
In particular it is not necessary to check structfieldsk_flags/ to
see if the comparison value is NULL, because the SP-GiST core code
will filter out such conditions.
+   structfieldorderbyKeys/, of length structfieldnorderbys/,
+   describes ordering operators (if any) in the same fashion.
structfieldreconstructedValue/ is the value reconstructed for the
parent tuple; it is literal(Datum) 0/ at the root level or if the
functioninner_consistent/ function did not provide a value at the
parent level.
+   structfieldsuppValue/ is any additional value that an implementation
+   decided to store for the parent node (literal(Datum) 0/ in case the
+   current node is root).
structfieldlevel/ is the current inner tuple's level, starting at
zero for the root level.
structfieldreturnData/ is literaltrue/ if reconstructed data is
@@ -592,7 +608,6 @@ typedef struct spgInnerConsistentOut
structfieldnNodes/ is the number of child nodes contained in the
inner tuple, and
structfieldnodeLabels/ is an array of their label values, or
-   NULL if the nodes do not have labels.
/para

para
@@ -608,9 +623,17 @@ typedef struct spgInnerConsistentOut
structfieldreconstructedValues/ to an array of the values
reconstructed for each child node to be visited; otherwise, leave
structfieldreconstructedValues/ as NULL.
+   structfieldsuppValues/ serves a similar purpose, holding
+   the implementation-defined data for the inner nodes.
+   For structfielddistances/: if an ordered search is carried out,
+   the implementation is supposed to fill it in accordance with the
+   ordering operators provided in structfieldorderbyKeys/
+   (nodes with the lowest distances will be processed first). Leave it
+   NULL otherwise.
Note that the functioninner_consistent/ function is
responsible for palloc'ing the
+   structfielddistances/, structfieldsuppValues/ and
structfieldreconstructedValues/ arrays.
/para
/listitem
@@ -636,7 +659,9 @@ CREATE FUNCTION ```

### Re: [HACKERS] GSoC on WAL-logging hash indexes

```On Thu, Jun 19, 2014 at 6:40 PM, Vik Fearing vik.fear...@dalibo.com wrote:

On 04/30/2014 11:41 PM, Tom Lane wrote:
We really oughta fix the WAL situation, not just band-aid around it.

After re-reading this thread, it is not clear that anyone is going to
work on it so I'll just ask:

Is anyone working on this?

If not, I'd like to put it on my plate.

Vik, did you get time to look at that finally?
Regards,
--
Michael

```

### Re: [HACKERS] jsonb format is pessimal for toast compression

```On 08/15/2014 04:19 PM, Tom Lane wrote:
Personally I'd prefer to go to the all-lengths approach, but a large
part of that comes from a subjective assessment that the hybrid approach
is too messy.  Others might well disagree.

In case anyone else wants to do measurements on some more data sets,
attached is a copy of Heikki's patch updated to apply against git tip.

Note that this is not 100% comparable because I'm running it against git
clone, and the earlier tests were against beta2.  However, the Heikki
patch looks like a bust on this dataset -- see below.

postgres=# select pg_size_pretty(pg_total_relation_size('jsonic'));
 pg_size_pretty
----------------
 394 MB
(1 row)

postgres=# select pg_size_pretty(pg_total_relation_size('jsonbish'));
 pg_size_pretty
----------------
 542 MB
(1 row)

Extraction Test:

postgres=# explain analyze select row_to_json -> 'kt1_total_sum' from
jsonbish where row_to_json @> '{ "rpt_per_dt" : "2003-06-30" }';
                              QUERY PLAN
---------------------------------------------------------------------
Bitmap Heap Scan on jsonbish  (cost=29.55..582.92 rows=200 width=18)
(actual time=22.742..5281.823 rows=100423 loops=1)
  Recheck Cond: (row_to_json @> '{"rpt_per_dt": "2003-06-30"}'::jsonb)
  Heap Blocks: exact=1471
  ->  Bitmap Index Scan on jsonbish_row_to_json_idx  (cost=0.00..29.50
rows=200 width=0) (actual time=22.445..22.445 rows=100423 loops=1)
        Index Cond: (row_to_json @> '{"rpt_per_dt": "2003-06-30"}'::jsonb)
Planning time: 0.095 ms
Execution time: 5292.047 ms
(7 rows)

So, that extraction test is about 1% *slower* than the basic Tom Lane
lengths-only patch, and still 80% slower than original JSONB.  And it's
the same size as the lengths-only version.

Huh?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```

### Re: [HACKERS] wrapping in extended mode doesn't work well with default pager

```On Mon, Aug 18, 2014 at 12:30:40PM +0100, Greg Stark wrote:
On Tue, Aug 5, 2014 at 3:41 AM, Noah Misch n...@leadboat.com wrote:
This remains open for 9.4.  Your proposal to revert the feature in 9.4 and
fix
it in 9.5 sounds reasonable.

Ok, I've gone ahead and done this. I'm sorry for the delays and confusion.

Thanks.

I did try psql-wrapped-expanded-fix-v5.patch with the tests Peter and I
posted
upthread, and those tests now behave as they do in released versions.  What
cases did you find that still change vs. 9.3?

I was trying to build a spreadsheet of every combination of these
options. It turns out that 4-dimensional spreadsheets are kind of
awkward.

What's one query that still behaves differently in 9.5 vs. 9.3 (under
formatting options that exist in both versions)?

I think the fundamental dilemma was the same that was discussed
column for the wrap indicators then all expanded modes should have
that column to be consistent since wrapping shouldn't change the

I might agree for a greenfield design, but -1 for changing expanded mode now
to improve consistency in this way.  I predict the complaints from users of
expanded mode in automation would overpower any applause for the consistency.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```

### Re: [HACKERS] replication commands and log_statements

```On Wed, Aug 20, 2014 at 2:06 AM, Robert Haas robertmh...@gmail.com wrote:

On Sat, Aug 16, 2014 at 10:27 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
I think ideally it would have been better if we could have logged
replication commands under separate log_level, but as still there
is no consensus on extending log_statement and nobody is even
willing to pursue, it seems okay to go ahead and log these under
'all' level.

I think the consensus is clearly for a separate GUC.

+1.
--
Michael

```

### Re: [HACKERS] Hokey wrong versions of libpq in apt.postgresql.org

```

On 08/19/2014 08:34 AM, Craig Ringer wrote:

I reiterate: the current apt.postgresql.org is not doing things correctly.
It breaks things and it shouldn't.

FWIW, this is inconsistent with what yum.postgresql.org does - it takes
the POLA approach of packaging the libpq from the major release configured
in the repo. Each major version has its own sub-repo.

I find it pretty hard to justify installing a 9.3 libpq alongside a 9.1
server myself.

Exactly.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
If we send our children to Caesar for their education, we should
not be surprised when they come back as Romans.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```

### Re: [HACKERS] Verbose output of pg_dump not show schema name

```On Fri, Jul 25, 2014 at 4:45 AM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:

Given this is a very small and simple patch I thought it's not necessary...

I had a look at this patch, and here are a couple of comments:
1) Depending on how ArchiveEntry is called to register an object to
dump, namespace may be NULL, but that is not the case for
namespace->dobj.name, so you could get the namespace name at the top
of the functions that have their verbose output improved with something
like that:
const char *namespace = tbinfo->dobj.namespace ?
tbinfo->dobj.namespace->dobj.name : NULL;
And then simplify the message output as follows:
if (namespace)
    write_msg(blah \"%s\".\"%s\" blah, namespace, classname);
else
    write_msg(blah \"%s\" blah, classname);
You can as well safely remove the checks on namespace->dobj.name.
2) I don't think that this is correct:
-   ahlog(AH, 1, "processing data
for table \"%s\"\n",
- te->tag);
+   ahlog(AH, 1, "processing data
for table \"%s\".\"%s\"\n",
+ AH->currSchema, te->tag);
There are some code paths where AH->currSchema is set to NULL, and I
think that you should use te->namespace instead.
3) Changing only this message is not enough. The following verbose
messages need to be changed too for consistency:
- pg_dump: creating \$tag \$object
- pg_dump: setting owner and privileges for [blah]

I have been pondering as well about doing similar modifications to the
error message paths, but it did not seem worth it as this patch is
aimed only for the verbose output. Btw, I have basically fixed those
issues while doing the review, and finished with the attached patch.
Fabrizio, is this new version fine for you?
Regards,
--
Michael
From e0809869655c9e22cce11955c7286cef8a42bf1d Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Wed, 20 Aug 2014 14:40:40 +0900
Subject: [PATCH] Improve verbose messages of pg_dump with namespace

The namespace is added to the verbose output when it is available, and
relation and namespace names are put within quotes for clarity and
consistency with the other tools.
---
src/bin/pg_dump/pg_backup_archiver.c | 26 ---
src/bin/pg_dump/pg_dump.c| 85 ++--
2 files changed, 93 insertions(+), 18 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 3aebac8..07cc10e 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -546,8 +546,13 @@ RestoreArchive(Archive *AHX)
/* Both schema and data objects might now have ownership/ACLs */
if ((te->reqs & (REQ_SCHEMA | REQ_DATA)) != 0)
{
-			ahlog(AH, 1, "setting owner and privileges for %s %s\n",
-				  te->desc, te->tag);
+			/* Show namespace if available */
+			if (te->namespace)
+				ahlog(AH, 1, "setting owner and privileges for %s \"%s\".\"%s\"\n",
+					  te->desc, te->namespace, te->tag);
+			else
+				ahlog(AH, 1, "setting owner and privileges for %s \"%s\"\n",
+					  te->desc, te->tag);
_printTocEntry(AH, te, ropt, false, true);
}
}
@@ -621,7 +626,13 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te,

if ((reqs & REQ_SCHEMA) != 0)		/* We want the schema */
{
-		ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);
+		/* Show namespace if available */
+		if (te->namespace)
+			ahlog(AH, 1, "creating %s \"%s\".\"%s\"\n",
+				  te->desc, te->namespace, te->tag);
+		else
+			ahlog(AH, 1, "creating %s \"%s\"\n", te->desc, te->tag);
+

_printTocEntry(AH, te, ropt, false, false);
defnDumped = true;
@@ -713,8 +724,13 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
_becomeOwner(AH, te);
_selectOutputSchema(AH, te->namespace);

-	ahlog(AH, 1, "processing data for table \"%s\"\n",
-		  te->tag);
+	/* Show namespace if available */
+	if (te->namespace)
+		ahlog(AH, 1, "processing data for table \"%s\".\"%s\"\n",
+			  te->namespace, te->tag);
+	else
+		ahlog(AH, 1, "processing data for table \"%s\"\n",
+			  te->tag);

/*
* In parallel restore, if we created the table earlier in
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5c0f95f..dd7eef9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1383,6 +1383,8 @@ dumpTableData_copy(Archive *fout, void *dcontext)
{
TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
TableInfo  *tbinfo = tdinfo->tdtable;
+	const char *namespace = tbinfo->dobj.namespace ?
+		tbinfo->dobj.namespace->dobj.name : NULL;
const char *classname = tbinfo->dobj.name;
const bool	hasoids = tbinfo->hasoids;
const bool	oids = tdinfo->oids;
@@ -1400,7 +1402,16 @@ dumpTableData_copy(Archive *fout, void *dcontext)
const char *column_list;

if (g_verbose)
-		write_msg(NULL, "dumping contents of table %s\n", classname);
+	{
+		/* Print namespace information if ```