Re: [HACKERS] POC: Cache data in GetSnapshotData()

2017-08-28 Thread Mithun Cy
Cache the SnapshotData for reuse:
===
In one of our performance analyses with the perf tool, GetSnapshotData
showed a very high percentage of CPU cycles on a read-only workload
once the number of concurrent connections reached 64 or more.

Machine: cthulhu, an 8-node machine.
--
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):128
On-line CPU(s) list:   0-127
Thread(s) per core:2
Core(s) per socket:8
Socket(s): 8
NUMA node(s):  8
Vendor ID: GenuineIntel
CPU family:6
Model: 47
Model name:Intel(R) Xeon(R) CPU E7- 8830  @ 2.13GHz
Stepping:  2
CPU MHz:   2128.000
BogoMIPS:  4266.63
Virtualization:VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  256K
L3 cache:  24576K
NUMA node0 CPU(s): 0,65-71,96-103
NUMA node1 CPU(s): 72-79,104-111
NUMA node2 CPU(s): 80-87,112-119
NUMA node3 CPU(s): 88-95,120-127
NUMA node4 CPU(s): 1-8,33-40
NUMA node5 CPU(s): 9-16,41-48
NUMA node6 CPU(s): 17-24,49-56
NUMA node7 CPU(s): 25-32,57-64

Perf CPU cycles, 128 clients:

On further analysis, this appears to be mainly due to cache misses
incurred while iterating through the large proc array to compute the
SnapshotData. Moreover, on a read-only (read-heavy) workload the
SnapshotData mostly remains the same, since transactions rarely commit
or roll back. Taking advantage of this, we can save the previously
computed SnapshotData in shared memory, and GetSnapshotData can use
the saved SnapshotData instead of recomputing it while it is still
valid. A saved SnapshotData remains valid until a transaction ends.
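The idea can be sketched roughly as follows. All names and structures
here are hypothetical simplifications, not the actual patch; in
particular, the real code would have to manipulate the shared cache
under ProcArrayLock, which this single-threaded sketch omits:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CACHED_XIDS 64

/* Hypothetical stand-in: the real struct lives in snapshot.h */
typedef struct SnapshotData
{
    uint32_t xmin;
    uint32_t xmax;
    int      xcnt;
    uint32_t xip[MAX_CACHED_XIDS];
} SnapshotData;

/* Hypothetical shared-memory cache of the last computed snapshot */
static struct
{
    bool         valid;          /* cleared whenever a transaction ends */
    SnapshotData saved;
} snapshot_cache;

static int compute_calls = 0;    /* counts slow-path proc array scans */

/* Slow path: in the real code this iterates the shared proc array */
static void
compute_snapshot(SnapshotData *snap)
{
    compute_calls++;
    snap->xmin = 100;
    snap->xmax = 200;
    snap->xcnt = 0;
}

/* Transaction end (commit/abort) must invalidate the shared cache */
static void
InvalidateCachedSnapshot(void)
{
    snapshot_cache.valid = false;
}

/*
 * Fast path: while no transaction has ended, reuse the cached snapshot
 * with a cheap copy instead of rescanning the proc array.
 */
static void
GetSnapshotDataCached(SnapshotData *snap)
{
    if (!snapshot_cache.valid)
    {
        compute_snapshot(&snapshot_cache.saved);
        snapshot_cache.valid = true;
    }
    *snap = snapshot_cache.saved;
}
```

On a read-only workload the invalidation almost never fires, so the
slow path runs once and every later caller takes the copy.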

[Perf analysis Base]

Samples: 421K of event 'cache-misses', Event count (approx.): 160069274
Overhead  Command   Shared Object  Symbol
 18.63%   postgres  postgres       [.] GetSnapshotData
 11.98%   postgres  postgres       [.] _bt_compare
 10.20%   postgres  postgres       [.] PinBuffer
  8.63%   postgres  postgres       [.] LWLockAttemptLock
  6.50%   postgres  postgres       [.] LWLockRelease


[Perf analysis with Patch]


Server configuration:
./postgres -c shared_buffers=8GB -N 300 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c
maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 -c
wal_buffers=256MB &

pgbench configuration:
scale_factor = 300
./pgbench -c $threads -j $threads -T $time_for_reading -M prepared -S  postgres

The machine has 64 cores, and with this patch the server starts
showing improvement beyond 64 clients. I have tested up to 256
clients, where the performance improvement peaks at nearly 39% [1].
This is the best case for the patch, where a once-computed
SnapshotData is reused repeatedly.

The worst case seems to be small, quick write transactions, which
frequently invalidate the cached SnapshotData before it is reused by
any subsequent GetSnapshotData call. So far I have run simple update
tests (pgbench -M prepared -N) on the same machine with the above
server configuration, and I do not see much change in TPS numbers.

All TPS numbers are the median of 3 runs.

Clients   TPS-With Patch 05   TPS-Base       %Diff
1               752.461117       755.186777   -0.3%
64            32171.296537     31202.153576   +3.1%
128           41059.660769     40061.929658   +2.49%

I will do some profiling to find out why this case does not cost us
any performance despite the caching overhead.


[]

On Thu, Aug 3, 2017 at 2:16 AM, Andres Freund  wrote:
> Hi,
>
> I think this patch should have a "cover letter" explaining what it's
> trying to achieve, how it does so and why it's safe/correct.  I think
> it'd also be good to try to show some of the worst cases of this patch
> (i.e. where the caching only adds overhead, but never gets used), and
> some of the best cases (where the cache gets used quite often, and
> reduces contention significantly).
>
> I wonder if there's a way to avoid copying the snapshot into the cache
> in situations where we're barely ever going to need it. But right now
> the only way I can think of is to look at the length of the
> ProcArrayLock wait queue and count the exclusive locks therein - which'd
> be expensive and intrusive...
>
>
> On 2017-08-02 15:56:21 +0530, Mithun Cy wrote:
>> diff --git a/src/backend/storage/ipc/procarray.c 
>> b/src/backend/storage/ipc/procarray.c
>> index a7e8cf2..4d7debe 100644
>> --- a/src/backend/storage/ipc/procarray.c
>> +++ b/src/backend/storage/ipc/procarray.c
>> @@ -366,11 +366,13 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
>>   (arrayP->numProcs - index - 1) * 
>> sizeof(int));
>>   arrayP->pgprocnos[arrayP->numProcs - 1] = -1;   /* for 
>> debugging */
>>   

Re: [HACKERS] Adding hook in BufferSync for backup purposes

2017-08-28 Thread Andrey Borodin
Hi hackers!

> On 8 Aug 2017, at 11:27, Tom Lane wrote:
> 
> My point is not to claim that we mustn't put a hook there.  It's that what
> such a hook could safely do is tightly constrained, and you've not offered
> clear evidence that there's something useful to be done within those
> constraints.  Alvaro seems to think that the result might be better
> reached by hooking in at other places, and my gut reaction is similar.
> 


I was studying through work done by Marco and Gabriel on the matter of tracking 
pages for the incremental backup, and I have found PTRACK patch by Yury 
Zhuravlev and PostgresPro [0]. This work seems to be solid and thoughtful. I 
have drafted a new prototype for hooks enabling incremental backup as extension 
based on PTRACK calls.

If you look at the original patch you can see that the attached patch differs 
slightly: the functionality to track the end of a critical section is omitted. 
I have not included it in this draft because it changes a very sensitive place, 
even for those who do not need it.

A subscriber to the proposed hook must remember that it is invoked inside a 
critical section. But there cannot be more than XLR_MAX_BLOCK_ID blocks for one 
transaction. The proposed draft does not add any functionality to track finished 
transactions or any atomic unit of work; it just provides a flow of changed block 
numbers. Neither does this draft make any assumption about where to store this 
information. I suppose subscribers could trigger asynchronous writes somewhere 
once the info for a given segment has accumulated (do we need a hook on segment 
end, then?). During an incremental backup you can then skip scanning those WAL 
segments for which you have an accumulated changeset of block numbers.

One thing that is not clear to me: if we are accumulating block numbers inside a 
critical section, then we must place them in a preallocated array. When is the 
best time to take these block numbers away, to empty that array and avoid 
overflow? PTRACK has an array of length XLR_MAX_BLOCK_ID and saves this array at 
the end of each critical section. But I want to avoid intervening in critical 
sections.
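To illustrate the constraint being discussed, here is a rough,
hypothetical sketch of such a hook body with a preallocated buffer
(names are mine, not PTRACK's); the open question above is exactly
when FlushChangedBlocks may safely run:

```c
#include <assert.h>
#include <stdint.h>

#define XLR_MAX_BLOCK_ID 32      /* per-record block limit, as in PostgreSQL */

typedef uint32_t BlockNumber;

/* Preallocated buffer: nothing is allocated inside a critical section */
static BlockNumber pending_blocks[XLR_MAX_BLOCK_ID];
static int pending_count = 0;
static int flushed_total = 0;    /* stands in for "handed to the subscriber" */

/*
 * Drain the buffer.  A real subscriber would persist the block numbers or
 * schedule an asynchronous write; here we only count them.  PTRACK does
 * this at the end of each critical section, which is what the mail above
 * wants to avoid.
 */
static void
FlushChangedBlocks(void)
{
    flushed_total += pending_count;
    pending_count = 0;
}

/* Hypothetical hook body: record one changed block, flushing on overflow */
static void
ReportChangedBlock(BlockNumber blkno)
{
    if (pending_count == XLR_MAX_BLOCK_ID)
        FlushChangedBlocks();
    pending_blocks[pending_count++] = blkno;
}
```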

Thank you for your attention, any thoughts will be appreciated.

Best regards, Andrey Borodin.

[0] https://gist.github.com/stalkerg/ab833d94e2f64df241f1835651e06e4b


0001-hooks-to-watch-for-changed-pages.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages

2017-08-28 Thread Dilip Kumar
On Wed, Aug 23, 2017 at 9:45 AM, Dilip Kumar  wrote:
>>

>> ...
>> if (tbm->nentries <= tbm->maxentries / 2)
>> {
>> /*
>>  * We have made enough room.
>> ...
>> I think we could try a higher fill factor, say, 0.9. tbm_lossify basically
>> just continues iterating over the hashtable with not much overhead per
>> call, so calling it more frequently should not be a problem. On the other
>> hand, it would have to process fewer pages, and the bitmap would be less
>> lossy.
>>
>> I didn't benchmark the index scan per se with the 0.9 fill factor, but the
>> reduction of lossy pages was significant.
>
> I will try this and produce some performance numbers.
>
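For reference, the experiment boils down to raising the stop threshold
in tbm_lossify; a toy model of the effect (not the tidbitmap.c code,
and entries_freed_per_page is a made-up simplification) is:

```c
#include <assert.h>

/*
 * tbm_lossify stops once enough entries have been freed.  Stock code
 * stops at maxentries / 2; the experiment stops at 0.9 * maxentries,
 * so each call lossifies fewer pages and the bitmap stays less lossy.
 */
static int
simulate_lossify(int nentries, int maxentries, double fillfactor,
                 int entries_freed_per_page)
{
    int lossy_pages = 0;
    int stop = (int) (maxentries * fillfactor);

    while (nentries > stop)
    {
        nentries -= entries_freed_per_page;   /* one page made lossy */
        lossy_pages++;
    }
    return lossy_pages;
}
```

With a full table (nentries == maxentries) the 0.9 fill factor makes
one call lossify a fifth of what the 0.5 factor would, at the price of
being called more often.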

I have done some performance testing as suggested by Alexander (patch attached).

Performance results: I can see a significant reduction in the
lossy_pages count in all the queries, and also a noticeable reduction
in execution time for some of them. I have tested with 2 different
work_mem settings. Below are the test results (lossy pages count and
execution time).


TPCH benchmark: 20 scale factor
Machine: Power, 4 sockets
Tested with max_parallel_workers_per_gather=0

Work_mem: 20 MB

(Lossy pages count:)
Query       head      patch

4         166551      35478
5         330679      35765
6        1160339     211357
14        666897     103275
15       1160518     211544
20       1982981     405903


(Time in ms:)
Query       head      patch

4          14849      14093
5          76790      74486
6          25816      14327
14         16011      11093
15         51381      35326
20            25     195501


Work_mem: 40 MB
(Lossy pages count:)
Query       head      patch

6         995223     195681
14        337894      75744
15        995417     195873
20       1654016     199113


(Time in ms:)
Query       head      patch

6          23819      14571
14         13514      11183
15         49980      32400
20        204441     188978

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


lossify_slow.patch
Description: Binary data



Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-08-28 Thread Michael Paquier
On Sat, Aug 26, 2017 at 8:00 AM, Robert Haas  wrote:
> What happens if you say VACUUM partitioned_table (a), some_partition (b)?

Using v9, if you do that:
=# CREATE TABLE parent (id int) PARTITION BY RANGE (id);
CREATE TABLE
=# CREATE TABLE child_1_10001 partition of parent for values from (1)
to (10001);
CREATE TABLE
=# CREATE TABLE child_10001_20001 partition of parent for values from
(10001) to (20001);
CREATE TABLE
=# insert into parent values (generate_series(1,2));
INSERT 0 2

Vacuuming the parent vacuums all its children, so any child listed
would get vacuumed twice; still, this does not cause an error:
=# vacuum parent, child_10001_2;
VACUUM
And with the de-duplication patch on top of it, things are vacuumed only once.
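The de-duplication policy under discussion (keep the first mention of
each relation, drop later duplicates) can be sketched like this. The
real patch works on a List of VacuumRelation nodes; this only
illustrates the policy on bare OIDs, and the function name is mine:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/*
 * Keep the first occurrence of each relation OID, dropping later
 * duplicates in place.  Returns the new length.  O(n^2), which is fine
 * for the handful of relations a VACUUM command names.
 */
static int
dedupe_relation_oids(Oid *oids, int n)
{
    int out = 0;

    for (int i = 0; i < n; i++)
    {
        bool seen = false;

        for (int j = 0; j < out; j++)
        {
            if (oids[j] == oids[i])
            {
                seen = true;
                break;
            }
        }
        if (!seen)
            oids[out++] = oids[i];
    }
    return out;
}
```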

On Tue, Aug 29, 2017 at 7:56 AM, Bossart, Nathan  wrote:
> On 8/23/17, 11:59 PM, "Michael Paquier"  wrote:
>> + * relations is a list of VacuumRelations to process.  If it is NIL, all
>> + * relations in the database are processed.
>> Typo here, VacuumRelation is singular.
>
> This should be fixed in v9.
>
> On 8/24/17, 5:45 PM, "Michael Paquier"  wrote:
>> This makes me think that it could be a good idea to revisit this bit
>> in a separate patch. ANALYZE fails as well now when the same column is
>> defined multiple times with an incomprehensible error message.
>
> The de-duplication code is now in a separate patch,
> dedupe_vacuum_relations_v1.patch.  I believe it fixes the incomprehensible
> error message you were experiencing, but please let me know if you are
> still hitting it.

It looks that problems in this area are fixed using the second patch.

> On 8/25/17, 6:00 PM, "Robert Haas"  wrote:
>> +oldcontext = MemoryContextSwitchTo(vac_context);
>> +foreach(lc, relations)
>> +temp_relations = lappend(temp_relations, copyObject(lfirst(lc)));
>> +MemoryContextSwitchTo(oldcontext);
>> +relations = temp_relations;
>>
>> Can't we just copyObject() on the whole list?
>
> I've made this change.
>
>> -ListCell   *cur;
>> -
>>
>> Why change this?  Generally, declaring a separate variable in an inner
>> scope seems like better style than reusing one that happens to be
>> lying around in the outer scope.
>
> I've removed this change.
>
>> +VacuumRelation *relinfo = (VacuumRelation *) lfirst(lc);
>>
>> Could use lfirst_node.
>
> Done.
>
> On 8/28/17, 5:28 PM, "Michael Paquier"  wrote:
>> Yes, if I understand that correctly. That's the point I am exactly
>> coming at. My suggestion is to have one VacuumRelation entry per
>> relation vacuumed, even for partitioned tables, and copy the list of
>> columns to each one.
>
> I've made this change in v9.  It does clean up the patch quite a bit.

Here is some input for vacuum_multiple_tables_v9, which I think is
getting close to something committable. Some minor comments follow.

   
-   With no parameter, VACUUM processes every table in the
+   With no parameters, VACUUM processes every table in the
current database that the current user has permission to vacuum.
-   With a parameter, VACUUM processes only that table.
+   With parameters, VACUUM processes only the tables
+   specified.
   
The part about parameters looks fine to me if unchanged.

+   foreach(lc, relations)
+   {
+   VacuumRelation *relation = lfirst_node(VacuumRelation, lc);
+   if (relation->va_cols != NIL && (options & VACOPT_ANALYZE) == 0)
+   ereport(ERROR,
+   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+errmsg("ANALYZE option must be specified when a
column list is provided")));
+   }
Could you add a hint with the relation name involved here? When many
relations are specified in the VACUUM query, this would be useful to
the user.

+   tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+   if (!HeapTupleIsValid(tuple))
+   elog(ERROR, "cache lookup failed for relation %u", relid);
+   classForm = (Form_pg_class) GETSTRUCT(tuple);
+   include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+   ReleaseSysCache(tuple);
It pains me to see that get_rel_relkind does not return an error if
the relation is missing; otherwise we could use it here. I would
welcome a refactoring with a missing_ok argument a lot! But this
VACUUM patch does not justify potentially breaking many extensions...

+   relinfo = makeNode(VacuumRelation);
+   rel_oid = HeapTupleGetOid(tuple);
+   relinfo->oid = rel_oid;
There are 4 patterns like that in the patch. We could make use of a
makeVacuumRelation.

About the de-duplication patch, I have to admit that I am still not a
fan of doing such a thing. Another road that we could take is to
simply complain with a proper error message if:
- the same column name is specified twice for a relation.
- the same relation 

Re: [HACKERS] NoMovementScanDirection in heapgettup() and heapgettup_pagemode()

2017-08-28 Thread Mithun Cy
On Mon, Aug 28, 2017 at 5:50 PM, Tom Lane  wrote:
> I think that's probably dead code given that ExecutorRun short-circuits
> everything for NoMovementScanDirection.  There is some use of
> NoMovementScanDirection for indexscans, to denote an unordered index,
> but likely that could be got rid of too.  That would leave us with
> NoMovementScanDirection having no other use except as a do-nothing
> flag for ExecutorRun.
>
> regards, tom lane

Thanks, Tom, yes it seems to be a dead code.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] show "aggressive" or not in autovacuum logs

2017-08-28 Thread David G. Johnston
On Mon, Aug 28, 2017 at 2:26 AM, Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

> https://www.postgresql.org/docs/devel/static/runtime-config-client.html
>
> > VACUUM performs an aggressive scan
>

Maybe this should get its own thread/patch, but I'll tack it on here
since it all seems related.

That paragraph you linked has a couple of typos:

"Although users can set this value anywhere from zero to two billions,
VACUUM" ...

should be 'two billion' (i.e., drop the "s")

"...so that a periodical manual VACUUM has..."

'periodic' - though the description in the linked 24.1.5 is somewhat
clearer (and longer) - the gap exists for the benefit of routine vacuum
invocations to detect the need for an aggressive vacuum as part of a normal
operating cycle rather than the last routine vacuum being non-aggressive
and shortly thereafter an auto-vacuum anti-wraparound run is performed.

Current:

VACUUM will silently limit the effective value to 95% of
autovacuum_freeze_max_age, so that a periodical manual VACUUM has a chance
to run before an anti-wraparound autovacuum is launched for the table.

My interpretation:

VACUUM will silently limit the effective value to 95% of
autovacuum_freeze_max_age so that a normal scan has a window within which
to detect the need to convert itself to an aggressive scan and preempt the
need for an untimely autovacuum initiated anti-wraparound scan.


As noted in 24.1.5, that "normal scan" can be time-scheduled or an
update-driven autovacuum run. It could be manual (though not really
periodic) if one is monitoring the aging and is notified to run VACUUM
manually when the age falls within the gap.
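The clamp being discussed works like this (a sketch of the documented
behavior, not the server code; 150000000 and 200000000 are the stock
defaults for vacuum_freeze_table_age and autovacuum_freeze_max_age):

```c
#include <assert.h>
#include <stdint.h>

/*
 * VACUUM silently limits the effective vacuum_freeze_table_age to 95%
 * of autovacuum_freeze_max_age, leaving a 5% window in which a routine
 * vacuum can turn itself aggressive before the anti-wraparound
 * autovacuum would fire.
 */
static int64_t
effective_freeze_table_age(int64_t freeze_table_age,
                           int64_t autovacuum_freeze_max_age)
{
    int64_t limit = autovacuum_freeze_max_age * 95 / 100;

    return freeze_table_age < limit ? freeze_table_age : limit;
}
```

So even a user setting of two billion is clamped to 190 million under
the default autovacuum_freeze_max_age.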

"Aggressive" sounds right throughout all of this.

Up-thread:
INFO:  vacuuming "public.it" in aggressive mode

Maybe:
INFO:  aggressively vacuuming "public.it"
INFO:  vacuuming "public.it"

Likewise:
LOG:  automatic vacuum of table "postgres.public.pgbench_branches": mode:
aggressive, index scans: 0

could be:
LOG:  automatic aggressive vacuum of table
"postgres.public.pgbench_branches", index scans: 0

Having read the docs and come to understand what "aggressive" means,
both wordings work for me (i.e., leaving non-aggressive unadorned).

David J.


Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 10:17 PM, Tom Lane  wrote:
> In the meantime, I think what we should do is commit the bug fix more or
> less as I have it, and then work on Amit's concern about losing parallel
> efficiency by separating the resetting of shared parallel-scan state
> into a new plan tree traversal that's done before launching new worker
> processes.  The only real alternative is to lobotomize the existing rescan
> optimizations, and that seems like a really poor choice from here.

There's already ExecParallelReinitialize, which could be made to walk
the nodes in addition to what it does already, but I don't understand
exactly what else needs fixing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] hash partitioning based on v10Beta2

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 10:44 PM, yangjie  wrote:
> When there are many partitions and much data, adding new partitions
> raises some efficiency problems.
> I don't see how the solution you're talking about would implement a
> hash partition?

I am having difficulty understanding this.  There was discussion on
the other thread of how splitting partitions could be done reasonably
efficiently with the proposed design; of course, it's never going to
be super-cheap.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-08-28 Thread Tom Lane
Michael Paquier  writes:
> I don't like breaking the abstraction of pg_log() with the existing
> flags with some kind of pg_debug() layer. The set of APIs present now
> in pg_rewind for logging is nice to have, and I think that those debug
> messages should be translated. So what about the attached?

Your point about INT64_FORMAT not necessarily working with fprintf
is an outstanding reason not to keep it like it is.  I've not reviewed
this patch in detail but I think this is basically the way to fix it.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-08-28 Thread Michael Paquier
On Mon, Aug 28, 2017 at 11:24 PM, Robert Haas  wrote:
> On Mon, Aug 28, 2017 at 10:16 AM, Alvaro Herrera
>  wrote:
>>> I am fine with however you want to handle it, but it seems odd to me
>>> that we don't have a way of embedding INT64_FORMAT in a translatable
>>> string.  Surely that's going to be a problem in some case, sometime,
>>> isn't it?
>>
>> The way we do that elsewhere is to print out the value to a string
>> variable and then use %s in the translatable message.
>
> Hmm.  OK.  That doesn't sound great, but if there's no better option...

I don't like breaking the abstraction of pg_log() with the existing
flags with some kind of pg_debug() layer. The set of APIs present now
in pg_rewind for logging is nice to have, and I think that those debug
messages should be translated. So what about the attached?
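The convention Alvaro describes - render the int64 into a buffer
first, then use plain %s in the translatable message - looks roughly
like this (self-contained sketch; int64 and INT64_FORMAT are stand-ins
for PostgreSQL's platform-dependent definitions):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef long long int64;         /* stand-in for PostgreSQL's int64 */
#define INT64_FORMAT "%lld"      /* stand-in; the real macro varies by platform */

/*
 * Render an int64 into a caller-supplied buffer.  The translatable
 * message then only ever contains %s, so gettext catalogs and the
 * frontend's fprintf never see INT64_FORMAT at all.
 */
static void
format_int64(char *buf, size_t buflen, int64 value)
{
    snprintf(buf, buflen, INT64_FORMAT, value);
}
```

A caller would then do something along the lines of
pg_log(PG_DEBUG, _("... offset %s ..."), buf) instead of embedding the
int64 conversion in the message string itself.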
-- 
Michael


rewind-int64-log.patch
Description: Binary data



Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Tom Lane
Robert Haas  writes:
> On Mon, Aug 28, 2017 at 6:35 PM, Tom Lane  wrote:
>> Maybe parallel_aware should have more than two values, depending
>> on whether the result of the node is context-dependent or not.

> It seems likely the whole concept of parallel_aware is only a
> zero-order approximation to what we really want.

Yeah, I agree --- but it's also clear that we don't yet know what it
should be.  We'll have to work that out as we accrete more functionality.

In the meantime, I think what we should do is commit the bug fix more or
less as I have it, and then work on Amit's concern about losing parallel
efficiency by separating the resetting of shared parallel-scan state
into a new plan tree traversal that's done before launching new worker
processes.  The only real alternative is to lobotomize the existing rescan
optimizations, and that seems like a really poor choice from here.

regards, tom lane




Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 5:24 AM, Kyotaro HORIGUCHI
 wrote:
> This patch have had interferences from several commits after the
> last submission. I amended this patch to follow them (up to
> f97c55c), removed an unnecessary branch and edited some comments.

I think the core problem for this patch is that there's no consensus
on what approach to take.  Until that somehow gets sorted out, I think
this isn't going to make any progress.  Unfortunately, I don't have a
clear idea what sort of solution everybody could tolerate.

I still think that some kind of slow-expire behavior -- like a clock
hand that hits each backend every 10 minutes and expires entries not
used since the last hit -- is actually pretty sensible.  It ensures
that idle or long-running backends don't accumulate infinite bloat
while still allowing the cache to grow large enough for good
performance when all entries are being regularly used.  But Tom
doesn't like it.  Other approaches were also discussed; none of them
seem like an obvious slam-dunk.

Turning to the patch itself, I don't know how we decide whether the
patch is worth it.  Scanning the whole (potentially large) cache to
remove negative entries has a cost, mostly in CPU cycles; keeping
those negative entries around for a long time also has a cost, mostly
in memory.  I don't know how to decide whether these patches will help
more people than it hurts, or the other way around -- and it's not
clear that anyone else has a good idea about that either.

Typos: funciton, paritial.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Improving overflow checks when adding tuple to PGresult Re: [GENERAL] Retrieving query results

2017-08-28 Thread Michael Paquier
On Mon, Aug 28, 2017 at 3:05 PM, Michael Paquier
 wrote:
> Attached are two patches:
> 1) 0001 refactors the code around pqAddTuple to be able to handle
> error messages and assign them in PQsetvalue particularly.
> 2) 0002 adds sanity checks in pqAddTuple for overflows, capping the
> size of what is allocated at INT_MAX but not more.
>
> pqRowProcessor() still has errmsgp, but it is never used on HEAD. At
> least with this set of patches it becomes useful. We could rework
> check_field_number() to use an error message string as well, but I
> have left that out to keep things simple. Not sure any complication
> is worth it compared to just copying the error message in case of a
> mismatched column number.

As this change requires, I think, an extra look, I am moving the
discussion to -hackers with a proper subject and the set of patches
attached (along with the test program). This patch is registered in
the next commit fest.
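A guard of the kind 0002 is after might look like this (illustrative
names, not libpq's internals - the point is just to refuse growth past
INT_MAX tuples or past what size_t arithmetic can express):

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Before growing the tuple array from ntups to ntups + 1, make sure the
 * new count still fits in an int and the byte size can't overflow.
 * elemsize is assumed nonzero (sizeof of a pointer or struct).
 */
static bool
tuple_count_ok(int ntups, size_t elemsize)
{
    if (ntups < 0 || ntups >= INT_MAX)          /* can't add one more */
        return false;
    if ((size_t) ntups + 1 > SIZE_MAX / elemsize)
        return false;                           /* byte size would overflow */
    return true;
}
```

In libpq proper the failure path would set the error message via the
refactoring in 0001 rather than assert.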
-- 
Michael


0001-Refactor-error-message-handling-in-pqAddTuple.patch
Description: Binary data


0002-Improve-overflow-checks-of-pqAddTuple-in-libpq.patch
Description: Binary data
/*
 * Script to test PQcopyResult and subsequently PQsetvalue.
 * Compile with for example:
 * gcc -lpq -g -o pg_copy_res pg_copy_res.c
 */

#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

#define DEFAULT_PORT	"5432"
#define DEFAULT_HOST	"/tmp"
#define DEFAULT_DB		"postgres"

int
main()
{
	char *port = getenv("PGPORT");
	char *host = getenv("PGHOST");
	char *dbname = getenv("PGDATABASE");
	char connstr[512];
	PGconn *conn;
	PGresult *res, *res_copy;

	if (port == NULL)
		port = DEFAULT_PORT;
	if (host == NULL)
		host = DEFAULT_HOST;
	if (dbname == NULL)
		dbname = DEFAULT_DB;

	snprintf(connstr, sizeof(connstr), "port=%s host=%s dbname=%s",
			 port, host, dbname);

	conn = PQconnectdb(connstr);

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "Connection to database failed: %s",
PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "SELECT 1");

	/* Copy the result we want; who cares what it is... */
	res_copy = PQcopyResult(res, PG_COPYRES_TUPLES | PG_COPYRES_ATTRS);

	PQclear(res);
	PQclear(res_copy);

	PQfinish(conn);
	return 0;
}



Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-28 Thread Masahiko Sawada
On Mon, Aug 28, 2017 at 4:59 PM, Fabien COELHO  wrote:
>
> Hello Masahiko-san,
>
> Patch applies cleanly, compiles, works for me.

Thank you for reviewing!

>
>>> At least: "Required to invoke" -> "Require to invoke".
>>
>>
>> Fixed.
>
>
> I meant the added one about -I, not the pre-existing one about -i.

Fixed.

>>> About the code:
>>>
>>> is_no_vacuum should be a bool?
>>
>>
>> We can change it but I think there is no difference actually. So
>> keeping it would be better.
>
>
> I would like to insist a little bit: the name was declared as an int but
> passed to init and used as a bool there before the patch. Conceptually what
> is meant is really a bool, and I see no added value bar saving a few strokes
> to have an int. ISTM that recent pgbench changes have started turning old
> int-for-bool habits into using bool when bool is meant.

Since is_no_vacuum is an existing variable, if we follow that habit we
should change other similar variables as well: is_init_mode,
do_vacuum_accounts and debug. I think we should change them in a
separate patch.

>
> initialize_cmds initialization: rather use pg_strdup instead of
> pg_malloc/strcpy?

Fixed.

> -I: pg_free before pg_strdup to avoid a small memory leak?

Fixed.

> I'm not sure I would have bothered with sizeof(char), but why not!
>
> I'm still a little bit annoyed by checkCustomCmds: when unhappy, it shows an
> error message and returns false, which immediately results in exit(1). However
> the pattern elsewhere in pgbench is to show the error and exit immediately. I
> would suggest simplifying by void-ing the function and exiting instead of
> returning false.

Agreed, fixed.

After more thought, I'm a bit inclined not to have a short option for
--custom-initialize, because this option will not be used in the
primary cases. It would be better to save short options for future
enhancements of pgbench. Thoughts?

Attached latest patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


pgbench_custom_initialization_v6.patch
Description: Binary data



Re: [HACKERS] Custom allocators in libpq

2017-08-28 Thread Craig Ringer
On 29 August 2017 at 05:15, Tom Lane  wrote:

> Peter Eisentraut  writes:
> > On 8/28/17 15:11, Tom Lane wrote:
> >> ... but it seems like you're giving up a lot of the possible uses if
> >> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
> >> the initial creation of a connection object with a custom malloc.  The
> >> obvious solution of requiring the functions to be specified at PQconnect
> >> time seems to require Yet Another PQconnect Variant, which is not very
> >> appetizing.
>
> > I would have expected a separate function just to register the callback
> > functions, before doing anything else with libpq.  Similar to libxml:
> > http://xmlsoft.org/xmlmem.html
>
> I really don't much care for libxml's solution, because it implies
> global variables, with the attendant thread-safety issues.  That's
> especially bad if you want a passthrough such as a memory context
> pointer, since it's quite likely that different call sites would
> need different passthrough values, even assuming that a single set
> of callback functions would suffice for an entire application.
> That latter assumption isn't so pleasant either.  One could expect
> that by using such a solution, postgres_fdw could be expected to
> break, say, a libpq-based DBI library inside plperl.


Yeah, the 'register a malloc() function pointer in a global via a
registration function call' approach seems fine and dandy until you find
yourself with an app that, via shared library loads, has more than one
user of libpq, each with its own ideas about memory allocation.

RTLD_LOCAL can help, but may introduce other issues.

So there doesn't seem to be much way around another PQconnect variant. Yay?
We could switch to a struct-passing argument model, but by the time you add
the necessary "nfields" argument so that you know how much of the struct
you can safely access, etc., just adding new connect functions starts to
look good in comparison.

Which reminds me, it kind of stinks that PQconnectdbParams and PQpingParams
accept key and value char* arrays, but PQconninfoParse produces a
PQconninfoOption*. This makes it seriously annoying when you want to
parse a connstring, make some transformations, and pass it to a connect
function. I pretty much always just put the user's original connstring in
'dbname' and set expand_dbname = true instead.

It might make sense to have any new function accept PQconninfoOption*. Or a
variant of PQconninfoParse that populates k/v arrays with 'n' extra fields
allocated and zeroed on return, I guess.
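The annoying conversion in question is the one everyone ends up
writing by hand; a sketch, using a stand-in struct so it is
self-contained (the real PQconninfoOption has more fields than these
two):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for libpq's PQconninfoOption */
typedef struct ConninfoOption
{
    char *keyword;
    char *val;                  /* NULL when the option is not set */
} ConninfoOption;

/*
 * Flatten a NULL-keyword-terminated option array into the parallel
 * keyword/value arrays that PQconnectdbParams expects, skipping unset
 * options and NULL-terminating both arrays.  Returns the number of
 * pairs filled; maxpairs must leave room for the terminator.
 */
static int
options_to_params(const ConninfoOption *opts,
                  const char **keywords, const char **values,
                  int maxpairs)
{
    int n = 0;

    for (; opts->keyword != NULL; opts++)
    {
        if (opts->val == NULL)
            continue;
        if (n >= maxpairs - 1)   /* keep room for the NULL terminator */
            break;
        keywords[n] = opts->keyword;
        values[n] = opts->val;
        n++;
    }
    keywords[n] = NULL;          /* PQconnectdbParams stops here */
    values[n] = NULL;
    return n;
}
```

A PQconninfoParse variant that returned this shape directly (perhaps
with a few spare zeroed slots for caller additions) would remove the
boilerplate.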

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] show "aggressive" or not in autovacuum logs

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 5:26 AM, Kyotaro HORIGUCHI
 wrote:
> Currently the message shows '%d skipped-frozen', but that is
> insufficient to verify the true effect. This is a patch to show the
> mode as 'aggressive' or 'normal' in the closing message of vacuum;
> the '%d skipped-frozen' figure in 'aggressive mode' shows the true
> effect of ALL_FROZEN.
>
> I will add this patch to CF2017-09.

I would be a bit inclined to somehow show aggressive if it's
aggressive and not insert anything at all otherwise.  That'd probably
require two separate translatable strings in each case, but maybe
that's OK.

What do other people think?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Write operations in parallel mode

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 12:23 PM, Antonin Houska  wrote:
> Now that dynamic shared memory hash table has been committed
> (8c0d7bafad36434cb08ac2c78e69ae72c194ca20) I wonder if it's still a big deal
> to remove restrictions like this in (e.g. heap_update()):
>
> /*
>  * Forbid this during a parallel operation, lest it allocate a combocid.
>  * Other workers might need that combocid for visibility checks, and we
>  * have no provision for broadcasting it to them.
>  */
> if (IsInParallelMode())
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
>  errmsg("cannot update tuples during a parallel operation")));

Well, it certainly doesn't solve that problem directly, but I think it
is useful infrastructure that lends itself to some kind of a solution
to that problem.  There are other problems, too, like group locking
vs. the relation extension lock, though Masahiko-san posted a patch
for that IIRC, and also group locking vs. GIN page locks, and also
proper interlocking of updates.  I think that the order in which we
should tackle things is probably something like:

1. Make inserts parallel-restricted rather than parallel-unsafe - i.e.
allow inserts but only in the leader.  This allows plans like Insert
-> Gather -> whatever.  I'm not sure there's a lot to do here other
than allow such plans to be generated.

2. Make updates and deletes parallel-restricted rather than
parallel-unsafe - i.e. allow updates and deletes but only in the
leader.  This similarly would allow Update -> Gather -> whatever and
Delete -> Gather -> whatever.  For this, you'd need a shared combo CID
hash so that workers can learn about new combo CIDs created by the
leader.

3. Make inserts parallel-safe rather than parallel-restricted.  This
allows plans like Gather -> Insert -> whatever, which is way better
than Insert -> Gather -> whatever.  If there's no RETURNING, the
Gather isn't actually gathering anything.  This requires sorting out
some group locking issues, but not tuple locking.

4. Make updates and deletes parallel-safe rather than
parallel-restricted.  Now tuple locking has to be sorted out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 6:35 PM, Tom Lane  wrote:
> but probably we should think of a more arm's-length way to do it.
> Maybe parallel_aware should have more than two values, depending
> on whether the result of the node is context-dependent or not.

My original intent for the parallel_aware flag was for it to signal
whether the plan node was going to do something functionally different
when in parallel mode.  For scans, that's come to mean "partition the
input among the workers", and there doesn't seem to be any other
sensible meaning.  I don't have a good idea what it's going to mean
for non-scan nodes yet.  Parallel Hash will be the first parallel-aware
non-scan node, and it uses the flag to mean that the hash table is in
dynamic shared memory, so that the inner side can be partial (which is
otherwise not possible).  I'm guessing that is going to be a common
meaning for nodes that store stuff - it's easy to imagine Parallel
Materialize, Parallel Sort, Parallel HashAggregate with similar
semantics.  There's also a proposed patch for Parallel Append where it
signals that DSM is being used to coordinate task scheduling and load
balancing.

It seems likely the whole concept of parallel_aware is only a
zero-order approximation to what we really want.  This bug is, IMHO,
the first really tangible evidence of the concept creaking around the
edges, but I've kind of had a feeling for a while that it wasn't
likely to be problem-free.  I'm still not sure exactly what the right
answer will turn out to be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Challenges preventing us moving to 64 bit transaction id (XID)?

2017-08-28 Thread Bruce Momjian
On Thu, Jul  6, 2017 at 07:29:07AM -0700, Jim Finnerty wrote:
> re: "The problem is if you want to delete from such a page.  Then you need to
> update the tuple's xmax and stick the new xid epoch somewhere."

I am coming to this very late, but wouldn't such a row be marked using
our frozen-committed fixed xid so it doesn't matter what the xid epoch is?
I realize with 64-bit xids we don't need to freeze tuples, but we could
still use a frozen-committed fixed xid, see:

#define FrozenTransactionId ((TransactionId) 2)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Tom Lane
Robert Haas  writes:
> On Mon, Aug 28, 2017 at 3:00 PM, Tom Lane  wrote:
>> Count the "that"s (you're failing to notice the next line).

> OK, true.  But "Academic literature" -> "The academic literature" is
> just second-guessing, I think.

No, it was more to avoid reflowing the paragraph (or leaving a weirdly
short line).

> There should never be a parallel_aware node that's not beneath a
> Gather or Gather Merge; I don't know what the meaning of such a plan
> would be, so I think we're safe against such a thing appearing in the
> future.  What I'm unclear about is what happens with nodes that aren't
> directly in the chain between the Gather and the parallel-aware node.

Nothing.  The parallel-aware node(s) get marked as dependent on the rescan
parameter, and then that marking bubbles up to their ancestor nodes (up
to the Gather).  Nodes that are not ancestral to any parallel-aware node
are unchanged.

> Now consider Parallel Hash
> (not yet committed), where we might get this:

> Something
> -> Gather
>   -> Merge Join
> -> Sort
>   -> Parallel Seq Scan on a
> -> Hash Join
>   -> Index Scan on b
>   -> Parallel Hash
> -> Parallel Seq Scan on c

> Now what?  We clearly still need to force a re-sort of a, but what
> about the shared hash table built on c?  If we've still got that hash
> table and it was a single-batch join, there's probably no need to
> rescan it even though both the Parallel Hash node and the Parallel Seq
> Scan beneath it are parallel-aware.  In this case all workers
> collaborated to build a shared hash table covering all rows from c; if
> we've still got that, it's still good.  In fact, even the workers can
> reuse that hash table even though, for them, it'll not really be a
> re-scan at all.

Well, I'd say that's something for the parallel hash patch to resolve.
Yes, if the contents of the hash table are expected to be the same
regardless of how many workers were involved, then we shouldn't need
to rebuild it after a rescan of the Gather.  That would mean not marking
either Parallel Hash or its descendant Parallel Seq Scan as dependent
on the Gather's rescan param.  That's not terribly hard to mechanize
within the structure of this patch: just ignore the param at/below
the ParallelHash.  Cowboy coding would be, perhaps,

if (plan->parallel_aware)
{
if (gather_param < 0)
elog(ERROR, "parallel-aware plan node is not below a Gather");
+   if (IsA(plan, Hash))
+   gather_param = -1;
+   else
context.paramids =
bms_add_member(context.paramids, gather_param);
}

but probably we should think of a more arm's-length way to do it.
Maybe parallel_aware should have more than two values, depending
on whether the result of the node is context-dependent or not.

regards, tom lane




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-08-28 Thread Michael Paquier
On Sat, Aug 26, 2017 at 8:00 AM, Robert Haas  wrote:
> On Thu, Aug 24, 2017 at 12:59 AM, Michael Paquier
>  wrote:
>> Robert, Amit and other folks working on extending the existing
>> partitioning facility would be in better position to answer that, but
>> I would think that we should have something as flexible as possible,
>> and storing a list of relation OID in each VacuumRelation makes it
>> harder to track the uniqueness of relations vacuumed. I agree that the
>> concept of a partition with multiple parents induces a lot of
>> problems. But the patch as proposed worries me as it complicates
>> vacuum() with a double loop: one for each relation vacuumed, and one
>> inside it with the list of OIDs. Modules calling vacuum() could also
>> use flexibility, being able to analyze a custom list of columns for
>> each relation has value as well.
>
> So ... why have a double loop?  I mean, you could just expand this out
> to one entry per relation actually being vacuumed, couldn't you?

Yes, if I understand that correctly.  That's exactly the point I am
getting at.  My suggestion is to have one VacuumRelation entry per
relation vacuumed, even for partitioned tables, and copy the list of
columns to each one.

> +oldcontext = MemoryContextSwitchTo(vac_context);
> +foreach(lc, relations)
> +temp_relations = lappend(temp_relations, copyObject(lfirst(lc)));
> +MemoryContextSwitchTo(oldcontext);
> +relations = temp_relations;
>
> Can't we just copyObject() on the whole list?

Yup.
-- 
Michael




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 3:00 PM, Tom Lane  wrote:
>> That sentence isn't wrong as written.
>
> Count the "that"s (you're failing to notice the next line).

OK, true.  But "Academic literature" -> "The academic literature" is
just second-guessing, I think.

>> I don't really understand the part about depending on a parallel-aware
>> node.  I mean, there should always be one, except in the
>> single-copy-Gather case, but why is it right to depend on that rather
>> than anything else?  What happens when the Parallel Hash patch goes in
>> and we have multiple parallel-aware scan nodes (plus a parallel-aware
>> Hash node) under the same Gather?
>
> Well, that's what I'm asking.  AFAICS we only really need the scan node(s)
> to be marked as depending on the Gather's rescan parameter.  It would not,
> however, hurt anything for nodes above them to be so marked as well ---
> and even if we didn't explicitly mark them, those nodes would end up
> depending on the parameter anyway because of the way that parameter
> dependency propagation works.  I think the question boils down to whether
> a "parallel_aware" node would ever not be underneath a related Gather.

There should never be a parallel_aware node that's not beneath a
Gather or Gather Merge; I don't know what the meaning of such a plan
would be, so I think we're safe against such a thing appearing in the
future.  What I'm unclear about is what happens with nodes that aren't
directly in the chain between the Gather and the parallel-aware node.
For instance:

Something
-> Gather
  -> Merge Join
-> Sort
  -> Parallel Seq Scan on a
-> Merge Join
  -> Sort
-> Seq Scan on b
  -> Sort
-> Seq Scan on c

If the Gather gets rescanned, is it OK to force a re-sort of a but not
of b or c?  Hmm, maybe so.  The workers are going to have to do the
sorts of b and c since any workers from a previous scan are GONE, but
if the leader's done that work, it's still good.  Similarly:

Something
-> Gather
  -> Merge Join
-> Sort
  -> Parallel Seq Scan on a
-> Hash Join
  -> Index Scan on b
  -> Hash
-> Seq Scan on c

If the leader's got an existing hash table built on c, it can reuse
it.  The workers will need to build one.  Now consider Parallel Hash
(not yet committed), where we might get this:

Something
-> Gather
  -> Merge Join
-> Sort
  -> Parallel Seq Scan on a
-> Hash Join
  -> Index Scan on b
  -> Parallel Hash
-> Parallel Seq Scan on c

Now what?  We clearly still need to force a re-sort of a, but what
about the shared hash table built on c?  If we've still got that hash
table and it was a single-batch join, there's probably no need to
rescan it even though both the Parallel Hash node and the Parallel Seq
Scan beneath it are parallel-aware.  In this case all workers
collaborated to build a shared hash table covering all rows from c; if
we've still got that, it's still good.  In fact, even the workers can
reuse that hash table even though, for them, it'll not really be a
re-scan at all.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Custom allocators in libpq

2017-08-28 Thread Tom Lane
Peter Eisentraut  writes:
> On 8/28/17 15:11, Tom Lane wrote:
>> ... but it seems like you're giving up a lot of the possible uses if
>> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
>> the initial creation of a connection object with a custom malloc.  The
>> obvious solution of requiring the functions to be specified at PQconnect
>> time seems to require Yet Another PQconnect Variant, which is not very
>> appetizing.

> I would have expected a separate function just to register the callback
> functions, before doing anything else with libpq.  Similar to libxml:
> http://xmlsoft.org/xmlmem.html

I really don't much care for libxml's solution, because it implies
global variables, with the attendant thread-safety issues.  That's
especially bad if you want a passthrough such as a memory context
pointer, since it's quite likely that different call sites would
need different passthrough values, even assuming that a single set
of callback functions would suffice for an entire application.
That latter assumption isn't so pleasant either.  One could expect
that with such a solution, postgres_fdw would break, say, a libpq-based
DBI library inside plperl.

regards, tom lane




Re: [HACKERS] Custom allocators in libpq

2017-08-28 Thread Peter Eisentraut
On 8/28/17 15:11, Tom Lane wrote:
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly.  I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc.  The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

I would have expected a separate function just to register the callback
functions, before doing anything else with libpq.  Similar to libxml:
http://xmlsoft.org/xmlmem.html

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Custom allocators in libpq

2017-08-28 Thread Aaron Patterson
On Mon, Aug 28, 2017 at 03:11:26PM -0400, Tom Lane wrote:
> Aaron Patterson  writes:
> > I would like to be able to configure libpq with custom malloc functions.
> 
> I can see the potential value of this ...
> 
> > This patch doesn't replace all malloc calls to the configured ones, just
> > the mallocs related to creating result objects (which is what I'm
> > concerned with).
> 
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly.

I'm happy to make the changes uniformly!  I'll do that and update the
patch.

> I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc.  The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

Other libraries I've worked with allow me to malloc a struct, then pass
it to an initialization function.  This might take a bit of refactoring,
like introducing a new `PQconnectStart`, but might be worthwhile.

> I also wonder whether you wouldn't want a passthrough argument.
> For instance, one of the use-cases that springs to mind immediately is
> teaching postgres_fdw and dblink to use this so that their result objects
> are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead.
> While I suppose we could have the hook functions always allocate in
> CurrentMemoryContext, it'd likely be useful to be able to specify
> "use context X" at creation time.

We don't need this for the Ruby wrapper, but I've seen other libraries
do it.  I'm happy to add it as well.

Thanks!

-- 
Aaron Patterson
http://tenderlovemaking.com/




Re: [HACKERS] expanding inheritance in partition bound order

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 6:38 AM, Amit Langote
 wrote:
> I am worried about the open relcache reference in PartitionDispatch when
> we start using it in the planner.  Whereas there is a ExecEndModifyTable()
> as a suitable place to close that reference, there doesn't seem to exist
> one within the planner, but I guess we will have to figure something out.

Yes, I think there's no real way to avoid having to figure that out.

> OK, done this way in the attached updated patch.  Any suggestions about a
> better name for what the patch calls PartitionTupleRoutingInfo?

I think this patch could be further simplified by continuing to use
the existing function signature for RelationGetPartitionDispatchInfo
instead of changing it to return a List * rather than an array.  I
don't see any benefit to such a change.  The current system is more
efficient.

I keep having the feeling that this is a big patch with a small patch
struggling to get out.  Is it really necessary to change
RelationGetPartitionDispatchInfo so much or could you just do a really
minimal surgery to remove the code that sets the stuff we don't need?
Like this:

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 96a64ce6b2..4fabcf9f32 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1089,29 +1089,7 @@ RelationGetPartitionDispatchInfo(Relation rel,
 pd[i] = (PartitionDispatch) palloc(sizeof(PartitionDispatchData));
 pd[i]->reldesc = partrel;
 pd[i]->key = partkey;
-pd[i]->keystate = NIL;
 pd[i]->partdesc = partdesc;
-if (parent != NULL)
-{
-/*
- * For every partitioned table other than root, we must store a
- * tuple table slot initialized with its tuple descriptor and a
- * tuple conversion map to convert a tuple from its parent's
- * rowtype to its own. That is to make sure that we are looking at
- * the correct row using the correct tuple descriptor when
- * computing its partition key for tuple routing.
- */
-pd[i]->tupslot = MakeSingleTupleTableSlot(tupdesc);
-pd[i]->tupmap = convert_tuples_by_name(RelationGetDescr(parent),
-   tupdesc,
-   gettext_noop("could not convert row type"));
-}
-else
-{
-/* Not required for the root partitioned table */
-pd[i]->tupslot = NULL;
-pd[i]->tupmap = NULL;
-}
 pd[i]->indexes = (int *) palloc(partdesc->nparts * sizeof(int));

 /*


-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Custom allocators in libpq

2017-08-28 Thread Tom Lane
Aaron Patterson  writes:
> I would like to be able to configure libpq with custom malloc functions.

I can see the potential value of this ...

> This patch doesn't replace all malloc calls to the configured ones, just
> the mallocs related to creating result objects (which is what I'm
> concerned with).

... but it seems like you're giving up a lot of the possible uses if
you don't make it apply uniformly.  I admit I'm not sure how we'd handle
the initial creation of a connection object with a custom malloc.  The
obvious solution of requiring the functions to be specified at PQconnect
time seems to require Yet Another PQconnect Variant, which is not very
appetizing.

I also wonder whether you wouldn't want a passthrough argument.
For instance, one of the use-cases that springs to mind immediately is
teaching postgres_fdw and dblink to use this so that their result objects
are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead.
While I suppose we could have the hook functions always allocate in
CurrentMemoryContext, it'd likely be useful to be able to specify
"use context X" at creation time.

regards, tom lane




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Tom Lane
Robert Haas  writes:
> On Mon, Aug 28, 2017 at 10:47 AM, Tom Lane  wrote:
> - fuller description.  Academic literature on parallel query suggests that
> + fuller description.  The academic literature on parallel query suggests

> That sentence isn't wrong as written.

Count the "that"s (you're failing to notice the next line).

> I don't really understand the part about depending on a parallel-aware
> node.  I mean, there should always be one, except in the
> single-copy-Gather case, but why is it right to depend on that rather
> than anything else?  What happens when the Parallel Hash patch goes in
> and we have multiple parallel-aware scan nodes (plus a parallel-aware
> Hash node) under the same Gather?

Well, that's what I'm asking.  AFAICS we only really need the scan node(s)
to be marked as depending on the Gather's rescan parameter.  It would not,
however, hurt anything for nodes above them to be so marked as well ---
and even if we didn't explicitly mark them, those nodes would end up
depending on the parameter anyway because of the way that parameter
dependency propagation works.  I think the question boils down to whether
a "parallel_aware" node would ever not be underneath a related Gather.

regards, tom lane




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 10:47 AM, Tom Lane  wrote:
> That seems like an unacceptably fragile assumption.  Even if it happens to
> be true today, we would need to fix it sooner or later.  (And I kinda
> suspect it's possible to break it today, anyway.  Treating PARAM_EXEC
> Params as parallel-restricted seems to lock out the easiest cases, but we
> have param slots that don't correspond to any Param node, eg for recursive
> union worktables.  replace_nestloop_params is also a source of PARAM_EXEC
> Params that won't be detected during is_parallel_safe() tests, because it
> happens later.)

Those particular cases are, I think, handled.  The CTE case is handled
by considering CTE scans as parallel-restricted, and the nestloop case
is handled by insisting that all partial paths must be
unparameterized.  You can join a partial path to a parameterized
non-partial path to make a new partial path, but neither the original
partial path nor the resulting one can itself be parameterized.

- fuller description.  Academic literature on parallel query suggests that
+ fuller description.  The academic literature on parallel query suggests

That sentence isn't wrong as written.

I don't really understand the part about depending on a parallel-aware
node.  I mean, there should always be one, except in the
single-copy-Gather case, but why is it right to depend on that rather
than anything else?  What happens when the Parallel Hash patch goes in
and we have multiple parallel-aware scan nodes (plus a parallel-aware
Hash node) under the same Gather?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Custom allocators in libpq

2017-08-28 Thread Aaron Patterson
Hello!

I would like to be able to configure libpq with custom malloc functions.
The reason is that we have a Ruby wrapper that exposes libpq in Ruby.
The problem is that Ruby's GC doesn't know how much memory has been
allocated by libpq, so no pressure is applied to the GC when it should
be.  Ruby exports malloc functions that automatically apply GC pressure,
and I'd like to be able to configure libpq to use those malloc
functions.

I've attached two patches that add this functionality.  The first patch
introduces a new function `PQunescapeByteaConn` which takes a
connection (so we have a place to get the malloc functions).  We already
have `PQescapeBytea` and `PQescapeByteaConn`, this first patch gives us
the analogous `PQunescapeBytea` and `PQunescapeByteaConn`.

The second patch adds malloc function pointer fields to `PGEvent`,
`pg_result`, and `pg_conn` structs, and changes libpq internals to use
those allocators rather than directly calling `malloc`.

This patch doesn't replace all malloc calls to the configured ones, just
the mallocs related to creating result objects (which is what I'm
concerned with).

If there's something I'm missing, please let me know.  This is my first
patch to libpq, so I look forward to hearing feedback.  Thanks for your
time!

-- 
Aaron Patterson
http://tenderlovemaking.com/
>From be463a2da13734d5b104a5fc9c58e7bbb8908323 Mon Sep 17 00:00:00 2001
From: Aaron Patterson 
Date: Thu, 19 Nov 2015 14:44:49 -0800
Subject: [PATCH 1/2] add PQunescapeByteaConn for unescaping bytes given a
 connection

This commit adds an unescape method that takes a connection as a
parameter.  The connection may have settings on it (like allocators)
that are important to unescaping the string.
---
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-exec.c   | 21 +++--
 src/interfaces/libpq/libpq-fe.h  |  6 --
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index d6a38d0df8..ba718f4071 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -172,3 +172,4 @@ PQsslAttribute169
 PQsetErrorContextVisibility 170
 PQresultVerboseErrorMessage 171
 PQencryptPasswordConn 172
+PQunescapeByteaConn   173
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index e1e2d18e3a..464bdc635f 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -3651,8 +3651,9 @@ PQescapeBytea(const unsigned char *from, size_t from_length, size_t *to_length)
  * \ooo == a byte whose value = ooo (ooo is an octal number)
  * \x   == x (x is any character not matched by the above transformations)
  */
-unsigned char *
-PQunescapeBytea(const unsigned char *strtext, size_t *retbuflen)
+static unsigned char *
+PQunescapeByteaInternal(PGconn *conn,
+						const unsigned char *strtext, size_t *retbuflen)
 {
 	size_t		strtextlen,
 				buflen;
@@ -3762,3 +3763,19 @@ PQunescapeBytea(const unsigned char *strtext, size_t *retbuflen)
 	*retbuflen = buflen;
 	return tmpbuf;
 }
+
+unsigned char *
+PQunescapeByteaConn(PGconn *conn,
+					const unsigned char *strtext, size_t *retbuflen)
+{
+	if (!conn)
+		return NULL;
+
+	return PQunescapeByteaInternal(conn, strtext, retbuflen);
+}
+
+unsigned char *
+PQunescapeBytea(const unsigned char *strtext, size_t *retbuflen)
+{
+	return PQunescapeByteaInternal(NULL, strtext, retbuflen);
+}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 1d915e7915..7fce4b387e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -527,13 +527,15 @@ extern char *PQescapeIdentifier(PGconn *conn, const char *str, size_t len);
 extern unsigned char *PQescapeByteaConn(PGconn *conn,
 					const unsigned char *from, size_t from_length,
 					size_t *to_length);
-extern unsigned char *PQunescapeBytea(const unsigned char *strtext,
-					size_t *retbuflen);
+extern unsigned char *PQunescapeByteaConn(PGconn *conn,
+					const unsigned char *strtext, size_t *retbuflen);
 
 /* These forms are deprecated! */
 extern size_t PQescapeString(char *to, const char *from, size_t length);
 extern unsigned char *PQescapeBytea(const unsigned char *from, size_t from_length,
 					size_t *to_length);
+extern unsigned char *PQunescapeBytea(const unsigned char *strtext,
+					size_t *retbuflen);
 
 
 
-- 
2.11.0

>From b62fed4a7e38ef9e2057d5556312a5a76cdf Mon Sep 17 00:00:00 2001
From: Aaron Patterson 
Date: Thu, 19 Nov 2015 15:28:27 -0800
Subject: [PATCH 2/2] use configured malloc / 

[HACKERS] Write operations in parallel mode

2017-08-28 Thread Antonin Houska
Now that dynamic shared memory hash table has been committed
(8c0d7bafad36434cb08ac2c78e69ae72c194ca20) I wonder if it's still a big deal
to remove restrictions like this in (e.g. heap_update()):

/*
 * Forbid this during a parallel operation, lest it allocate a combocid.
 * Other workers might need that combocid for visibility checks, and we
 * have no provision for broadcasting it to them.
 */
if (IsInParallelMode())
ereport(ERROR,
(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 errmsg("cannot update tuples during a parallel operation")));



-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at




Re: [HACKERS] auto_explain : log queries with wrong estimation

2017-08-28 Thread Adrien Nayrat
On 08/24/2017 03:08 PM, Maksim Milyutin wrote:
[...]
> 
> AFAICS you want to introduce two additional per-node variables:
>  - auto_explain_log_estimate_ratio that denotes minimum ratio (>= 1) between
> real value and planned one. I would add 'min' prefix before 'ratio'.
>  - auto_explain_log_estimate_min_rows - minimum absolute difference between
> those two values. IMHO this name is somewhat poor, the suffix 'min_diff_rows'
> looks better.
> If both real expressions (ratio and diff) exceed these threshold values,
> you log this situation.  Am I right?

Yes, you're totally right! I wonder if "ratio" is fine or if I should use 
"factor"?

[...]
> 
> Instrumentation is initialized only with analyze (log_analyze is true)[1]

Good, I didn't notice instrumentation can be enabled in auto_explain's hook. I
added these lines and it works:

if (auto_explain_log_estimate_ratio || auto_explain_log_estimate_min_rows)
{
 queryDesc->instrument_options |= INSTRUMENT_ROWS;
}

But I need to understand how instrumentation works.

Thanks for your answer. I will continue my work; actually, my patch is not
yet functional.


-- 
Adrien NAYRAT






Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Aug 28, 2017 at 6:34 PM, Tom Lane  wrote:
>> If what you're complaining about is that I put back the "if
>> (outerPlan->chgParam == NULL)" test to allow postponement of the
>> recursive ExecReScan call, I'm afraid that it's mere wishful
>> thinking that omitting that test in nodeGather did anything.

> Previously outerPlan->chgParam would be NULL, so I think rescans won't
> be postponed.

That seems like an unacceptably fragile assumption.  Even if it happens to
be true today, we would need to fix it sooner or later.  (And I kinda
suspect it's possible to break it today, anyway.  Treating PARAM_EXEC
Params as parallel-restricted seems to lock out the easiest cases, but we
have param slots that don't correspond to any Param node, eg for recursive
union worktables.  replace_nestloop_params is also a source of PARAM_EXEC
Params that won't be detected during is_parallel_safe() tests, because it
happens later.)

regards, tom lane




Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist

2017-08-28 Thread Ryan Murphy
> Thanks for reporting it!
>

My pleasure!

So the initial issue didn't happen the 2nd time. If misc_sanity was the
only test failing, then I guess my tests are working fine other than that.
Sweet!

When I get a break from work I'll review some patches!

Best,
Ryan


Re: [HACKERS] hash partitioning based on v10Beta2

2017-08-28 Thread Robert Haas
On Sat, Aug 26, 2017 at 12:40 AM, yang...@highgo.com  wrote:
> A partition table can be create as bellow:
>
>  CREATE TABLE h1 PARTITION OF h;
>  CREATE TABLE h2 PARTITION OF h;
>  CREATE TABLE h3 PARTITION OF h;

This syntax is very problematic for reasons that have been discussed
on the existing hash partitioning thread.  Fortunately, a solution has
already been implemented... you're the third person to try to write a
patch for this feature.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 10:16 AM, Alvaro Herrera
 wrote:
>> I am fine with however you want to handle it, but it seems odd to me
>> that we don't have a way of embedding INT64_FORMAT in a translatable
>> string.  Surely that's going to be a problem in some case, sometime,
>> isn't it?
>
> The way we do that elsewhere is to print out the value to a string
> variable and then use %s in the translatable message.

Hmm.  OK.  That doesn't sound great, but if there's no better option...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-08-28 Thread Alvaro Herrera
Robert Haas wrote:
> On Mon, Aug 28, 2017 at 9:05 AM, Alvaro Herrera
>  wrote:
> > Robert Haas wrote:
> >> pg_rewind: Fix some problems when copying files >2GB.
> >
> > I just noticed that this broke pg_rewind translation, because of the
> > INT64_FORMAT marker in the translatable string.  The message catalog now
> > has this:
> >
> > msgid "received chunk for file \"%s\", offset "
> >
> > for this source line:
> >
> > pg_log(PG_DEBUG, "received chunk for file \"%s\", offset " 
> > INT64_FORMAT ", size %d\n",
> >filename, chunkoff, chunksize);
> >
> > I don't think that this is terribly valuable as a translatable string,
> > so my preferred fix would be to have a new function pg_debug() here
> > where the messages are *not* marked for translations.
> 
> I am fine with however you want to handle it, but it seems odd to me
> that we don't have a way of embedding INT64_FORMAT in a translatable
> string.  Surely that's going to be a problem in some case, sometime,
> isn't it?

The way we do that elsewhere is to print out the value to a string
variable and then use %s in the translatable message.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-08-28 Thread Robert Haas
On Mon, Aug 28, 2017 at 9:05 AM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> pg_rewind: Fix some problems when copying files >2GB.
>
> I just noticed that this broke pg_rewind translation, because of the
> INT64_FORMAT marker in the translatable string.  The message catalog now
> has this:
>
> msgid "received chunk for file \"%s\", offset "
>
> for this source line:
>
> pg_log(PG_DEBUG, "received chunk for file \"%s\", offset " 
> INT64_FORMAT ", size %d\n",
>filename, chunkoff, chunksize);
>
> I don't think that this is terribly valuable as a translatable string,
> so my preferred fix would be to have a new function pg_debug() here
> where the messages are *not* marked for translations.

I am fine with however you want to handle it, but it seems odd to me
that we don't have a way of embedding INT64_FORMAT in a translatable
string.  Surely that's going to be a problem in some case, sometime,
isn't it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Amit Kapila
On Mon, Aug 28, 2017 at 6:34 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> On Mon, Aug 28, 2017 at 6:01 PM, Tom Lane  wrote:
>>> Um, what's different about that than before?
>
>> Earlier, we perform the rescan of all the nodes before ExecProcNode,
>> so workers will always start (restart) after the scan descriptor is
>> initialized.
>
> If what you're complaining about is that I put back the "if
> (outerPlan->chgParam == NULL)" test to allow postponement of the
> recursive ExecReScan call, I'm afraid that it's mere wishful
> thinking that omitting that test in nodeGather did anything.
> The nodes underneath the Gather are likely to do the same thing,
> so that the parallel table scan node itself is going to get a
> postponed rescan call anyway.  See e.g. ExecReScanNestLoop().
>

Previously outerPlan->chgParam would be NULL, so I think rescans won't
be postponed.
code that rescans were not postponed.  I don't deny that for some
cases it might get delayed but for simple cases, it was done
immediately.  I agree that in general, the proposed fix is better than
having nothing, but not sure if we need it for Gather as well
considering we are not able to demonstrate any case.

> I see your point that there's inadequate interlocking between resetting
> the parallel scan's shared state and starting a fresh set of workers,
> but that's a pre-existing bug.  In practice I doubt it makes any
> difference, because according to my testing the leader will generally
> reach the table scan long before any workers do.  It'd be nice to do
> better though.
>

Agreed.  BTW, I have mentioned above that we can avoid skipping
optimization in rescan path if we are in parallel mode.  I think that
will not be as elegant a solution as your patch, but it won't have
this problem.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist

2017-08-28 Thread Tom Lane
Ryan Murphy  writes:
> I did notice that the test seems to create a ROLE called regress_ecpg_user2:

Right.

> Could that implicitly create a database too?  I know that I somehow have a
> database named after my username / postgres role "murftown".

Maybe you've got some tool somewhere that automatically creates databases
for users?  PG itself doesn't.

> I've attached the new regression.diffs.

Oh, huh, that's actually a bug in the misc_sanity test --- it is legit for
at least some rows in pg_shdepend to have zero dbid.  That doesn't happen
when working in an empty installation, but yours evidently isn't.  Thanks
for reporting it!

regards, tom lane




Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist

2017-08-28 Thread Ryan Murphy
> No, you're reading it backwards: the error is expected, but it's not
> appearing in your results.  I can duplicate this if I manually create
> database "regress_ecpg_user2" before running ecpg's installcheck,
> so I guess that's what you did.  I can find no evidence that any
> part of the PG regression tests creates such a database.
>

Thanks, that makes sense.  However, when I go into my database
with psql and type `drop database regress_ecpg_user2;`, the
response is "ERROR:  database "regress_ecpg_user2" does not exist".
So it seems either the tests are creating it somehow and then cleaning
it up (though I agree that I found no obvious evidence of that in the
codebase), or could I be looking in the wrong postgres install?

I think it's the same install though.  I have my postgres installing into an
install_dir/ directory.  Here's how I run configure:

#!/bin/sh
CFLAGS='-O0' ./configure \
--prefix=path/to/postgres/install_dir/ \
--with-uuid=e2fs \
--enable-debug \
--with-perl

I did notice that the test seems to create a ROLE called regress_ecpg_user2:

$ git grep -n 'regress_ecpg_user2'
src/interfaces/ecpg/test/Makefile:78:REGRESS_OPTS =
--dbname=ecpg1_regression,ecpg2_regression
--create-role=regress_ecpg_user1,regress_ecpg_user2 $(EXTRA_REGRESS_OPTS)
...

Could that implicitly create a database too?  I know that I somehow have a
database named after my username / postgres role "murftown".

I ran "make installcheck-world" again, and the result is different this
time: a test called "misc_sanity" failed early on in the tests:

$ make installcheck-world
make -C src/test installcheck
make -C perl installcheck
make[2]: Nothing to be done for `installcheck'.
make -C regress installcheck
...
== dropping database "regression" ==
DROP DATABASE
== creating database "regression" ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test tablespace   ... ok
test boolean  ... ok
test char ... ok
...
test regex... ok
test oidjoins ... ok
test type_sanity  ... ok
test opr_sanity   ... ok
test misc_sanity  ... FAILED
test comments ... ok
test expressions  ... ok
test insert   ... ok
test insert_conflict  ... ok
test create_function_1... ok
...

The old failure with the missing error message is gone from
regression.diffs this time.
I've attached the new regression.diffs.

Best,
Ryan


regression2.diffs
Description: Binary data



Re: [HACKERS] hash partitioning based on v10Beta2

2017-08-28 Thread yang...@highgo.com
On Sat, Aug 26, 2017 at 10:10 AM, yang...@highgo.com 
wrote:

> Hi all,
>
> Now we have had the range / list partition, but hash partitioning is not
> implemented yet.
> Attached is a POC patch based on the v10Beta2 to add the
> hash partitioning feature.
> Although we will need more discussions about the syntax
> and other specifications before going ahead the project,
> but I think this runnable code might help to discuss
> what and how we implement this.
>
>
FYI, there is already an existing commitfest entry for this project.

https://commitfest.postgresql.org/14/1059/



> [...]

Re: [HACKERS] [POC] hash partitioning

2017-08-28 Thread yang...@highgo.com
Hello

Looking at your hash partitioning syntax, I implemented hash partitioning in a
more concise way: there is no need to determine the number of sub-tables in
advance, and partitions can be added dynamically.

Description

The hash partition's implement is on the basis of the original range / list 
partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key currently supports only one column. I think the partition
key could support multiple columns, which may be difficult to implement for
querying, but it is not impossible.

A partition table can be created as below:

 CREATE TABLE h1 PARTITION OF h;
 CREATE TABLE h2 PARTITION OF h;
 CREATE TABLE h3 PARTITION OF h;
 
The FOR VALUES clause cannot be used, and the partition bound is calculated
automatically as the partition index, a single integer value.

An inserted record is stored in a partition whose index equals 
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], 
TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts/* Number of partitions */
;
In the above example, this is 
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], 
TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
--+
 h1   |  3
 h1   |  5
 h1   | 17
 h1   | 19
 h2   |  2
 h2   |  6
 h2   |  7
 h2   | 11
 h2   | 12
 h2   | 14
 h2   | 15
 h2   | 18
 h2   | 20
 h3   |  1
 h3   |  4
 h3   |  8
 h3   |  9
 h3   | 10
 h3   | 13
 h3   | 16
(20 rows)

Partitions can be added dynamically. When a new partition is created, the
number of partitions changes, so the calculated target partition for a given
key changes too, and the same key may no longer map to the partition that
holds it. Therefore, when you create a new partition, the existing data must
be re-hashed and moved to its new target partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
--+
 h1   |  5
 h1   | 17
 h1   | 19
 h1   |  6
 h1   | 12
 h1   |  8
 h1   | 13
 h2   | 11
 h2   | 14
 h3   |  1
 h3   |  9
 h3   |  2
 h3   | 15
 h4   |  3
 h4   |  7
 h4   | 18
 h4   | 20
 h4   |  4
 h4   | 10
 h4   | 16
(20 rows)

When querying the data, the hash partition uses the same algorithm as the 
insertion, and filters out the table that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
 QUERY PLAN 


 Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 
loops=1)
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4) (actual 
time=0.013..0.016 rows=1 loops=1)
 Filter: (id = 1)
 Rows Removed by Filter: 3
 Planning time: 0.346 ms
 Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
 QUERY PLAN 


 Append  (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 
loops=1)
   ->  Seq Scan on h1  (cost=0.00..41.88 rows=26 width=4) (actual 
time=0.015..0.018 rows=1 loops=1)
 Filter: (id = ANY ('{1,5}'::integer[]))
 Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=26 width=4) (actual 
time=0.005..0.007 rows=1 loops=1)
 Filter: (id = ANY ('{1,5}'::integer[]))
 Rows Removed by Filter: 3
 Planning time: 0.720 ms
 Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
 QUERY PLAN 


 Append  (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 
loops=1)
   ->  Seq Scan on h1  (cost=0.00..48.25 rows=25 width=4) (actual 
time=0.015..0.019 rows=1 loops=1)
 Filter: ((id = 1) OR (id = 5))
 Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..48.25 rows=25 width=4) (actual 
time=0.005..0.010 rows=1 loops=1)
 Filter: ((id = 1) OR (id = 5))
 Rows Removed by Filter: 3
 Planning time: 0.396 ms
 Execution time: 0.139 ms
(9 rows)

Detaching, attaching, and dropping partition tables is not yet supported.

Best regards,
young


yonj1e.github.io
yang...@highgo.com


[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.

2017-08-28 Thread Alvaro Herrera
Robert Haas wrote:
> pg_rewind: Fix some problems when copying files >2GB.

I just noticed that this broke pg_rewind translation, because of the
INT64_FORMAT marker in the translatable string.  The message catalog now
has this:

msgid "received chunk for file \"%s\", offset "

for this source line:

pg_log(PG_DEBUG, "received chunk for file \"%s\", offset " INT64_FORMAT 
", size %d\n",
   filename, chunkoff, chunksize);

I don't think that this is terribly valuable as a translatable string,
so my preferred fix would be to have a new function pg_debug() here
where the messages are *not* marked for translations.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Aug 28, 2017 at 6:01 PM, Tom Lane  wrote:
>> Um, what's different about that than before?

> Earlier, we perform the rescan of all the nodes before ExecProcNode,
> so workers will always start (restart) after the scan descriptor is
> initialized.

If what you're complaining about is that I put back the "if
(outerPlan->chgParam == NULL)" test to allow postponement of the
recursive ExecReScan call, I'm afraid that it's mere wishful
thinking that omitting that test in nodeGather did anything.
The nodes underneath the Gather are likely to do the same thing,
so that the parallel table scan node itself is going to get a
postponed rescan call anyway.  See e.g. ExecReScanNestLoop().

I see your point that there's inadequate interlocking between resetting
the parallel scan's shared state and starting a fresh set of workers,
but that's a pre-existing bug.  In practice I doubt it makes any
difference, because according to my testing the leader will generally
reach the table scan long before any workers do.  It'd be nice to do
better though.

I'm inclined to think that what's needed is to move the reset of the
shared state into a new "ExecParallelReInitializeDSM" plan tree walk,
which would have to occur before we start the new set of workers.

regards, tom lane




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Amit Kapila
On Mon, Aug 28, 2017 at 6:01 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> With this change, it is quite possible that during rescans workers
>> will not do any work.
>
> Um, what's different about that than before?
>

Earlier, we perform the rescan of all the nodes before ExecProcNode,
so workers will always start (restart) after the scan descriptor is
initialized.  Now, as is evident in the discussion in this thread, that
was not the right thing for gather merge, as some nodes like Sort do
an optimization due to which rescan for the lower nodes will never be
called.  So, we need to ensure in some way that we don't
skip rescanning in such nodes and one way to achieve that is what you
have done in the patch, but it seems to have some side effect.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Tom Lane
Amit Kapila  writes:
> With this change, it is quite possible that during rescans workers
> will not do any work.

Um, what's different about that than before?

regards, tom lane




Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-28 Thread Amit Kapila
On Mon, Aug 28, 2017 at 1:59 AM, Tom Lane  wrote:
> I wrote:
>> I think that the correct fix probably involves marking each parallel scan
>> plan node as dependent on a pseudo executor parameter, which the parent
>> Gather or GatherMerge node would flag as being changed on each rescan.
>> This would cue the plan layers in between that they cannot optimize on the
>> assumption that the leader's instance of the parallel scan will produce
>> exactly the same rows as it did last time, even when "nothing else
>> changed".  The "wtParam" pseudo parameter that's used for communication
>> between RecursiveUnion and its descendant WorkTableScan node is a good
>> model for what needs to happen.
>
> Here is a draft patch for this.

! /*
! * Set child node's chgParam to tell it that the next scan might deliver a
! * different set of rows within the leader process.  (The overall rowset
! * shouldn't change, but the leader process's subset might; hence nodes
! * between here and the parallel table scan node mustn't optimize on the
! * assumption of an unchanging rowset.)
! */
! if (gm->rescan_param >= 0)
! outerPlan->chgParam = bms_add_member(outerPlan->chgParam,
! gm->rescan_param);
!
!
! /*
! * if chgParam of subnode is not null then plan will be re-scanned by
! * first ExecProcNode.
! */
! if (outerPlan->chgParam == NULL)
! ExecReScan(outerPlan);


With this change, it is quite possible that during rescans workers
will not do any work.  I think this will allow workers to launch
before rescan (for sequence scan) can reset the scan descriptor in the
leader which means that workers will still see the old value and
assume that the scan is finished and come out without doing any work.
Now, this won't produce wrong results because the leader will scan the
whole relation by itself in such a case, but it might be inefficient.

> It's a bit different from wtParam in
> that the special parameter isn't allocated until createplan.c time,
> so that we don't eat a parameter slot if we end up choosing a non-parallel
> plan; but otherwise things are comparable.
>
> I could use some feedback on whether this is marking dependent child nodes
> sanely.  As written, any plan node that's marked parallel_aware is assumed
> to need a dependency on the parent Gather or GatherMerge's rescan param
> --- and the planner will now bitch if a parallel_aware plan node is not
> under any such Gather.  Is this reasonable?

I think so.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] NoMovementScanDirection in heapgettup() and heapgettup_pagemode()

2017-08-28 Thread Tom Lane
Mithun Cy  writes:
> I was trying to study NoMovementScanDirection part of heapgettup() and
> heapgettup_pagemode(). If I am right there is no test in test suit to
> hit this code. I did run make check-world could not hit it. Also,
> coverage report in
> https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html
> shows this part of the code is not hit. Can somebody please help me to
> understand this part of the code and how to test same?

I think that's probably dead code given that ExecutorRun short-circuits
everything for NoMovementScanDirection.  There is some use of
NoMovementScanDirection for indexscans, to denote an unordered index,
but likely that could be got rid of too.  That would leave us with
NoMovementScanDirection having no other use except as a do-nothing
flag for ExecutorRun.

regards, tom lane




Re: [HACKERS] standby server crashes hard on out-of-disk-space in HEAD

2017-08-28 Thread Michael Paquier
On Tue, Jun 13, 2017 at 4:21 AM, Andres Freund  wrote:
> On 2017-06-12 15:12:23 -0400, Robert Haas wrote:
>> Commit 4b4b680c3d6d8485155d4d4bf0a92d3a874b7a65 (Make backend local
>> tracking of buffer pins memory efficient., vintage 2014) seems like a
>> likely culprit here, but I haven't tested.
>
> I'm not that sure. As written above, the Assert isn't new, and given
> this hasn't been reported before, I'm a bit doubtful that it's a general
> refcount tracking bug.  The FPI code has been whacked around more
> heavily, so it could well be a bug in it somewhere.

Someone doing a bisect could just use a VM that puts the standby on
a tiny partition. I remember seeing this assertion failure some time
ago on a test deployment, and that was really surprising. I think that
this may be hiding something, so we should really investigate further
what's wrong here.
-- 
Michael




Re: [HACKERS] Make pg_regress print a connstring with sockdir

2017-08-28 Thread Craig Ringer
On 28 August 2017 at 19:45, Tom Lane  wrote:

> Craig Ringer  writes:
> > It's a pain having to find the postmaster command line to get the port
> > pg_regress started a server on. We print the port in the pg_regress
> output,
> > why not the socket directory / host?
>
> I'm not following the point here.  The test postmaster isn't really
> going to be around long enough to connect to it manually.  If you
> want to do that, you should be using "installcheck", and then the
> problem doesn't arise.
>
> The reason for printing the port number, if memory serves, is to
> aid in debugging port selection conflicts.  That doesn't really
> apply for temporary socket directories; we're expecting libc to
> avoid any conflicts there.
>

I'm frequently debugging postmasters that are around long enough.
Deadlocks, etc.

It's also way easier to debug shmem related issues with a live postmaster
vs a core.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Make pg_regress print a connstring with sockdir

2017-08-28 Thread Tom Lane
Craig Ringer  writes:
> It's a pain having to find the postmaster command line to get the port
> pg_regress started a server on. We print the port in the pg_regress output,
> why not the socket directory / host?

I'm not following the point here.  The test postmaster isn't really
going to be around long enough to connect to it manually.  If you
want to do that, you should be using "installcheck", and then the
problem doesn't arise.

The reason for printing the port number, if memory serves, is to
aid in debugging port selection conflicts.  That doesn't really
apply for temporary socket directories; we're expecting libc to
avoid any conflicts there.

regards, tom lane




Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist

2017-08-28 Thread Tom Lane
Ryan Murphy  writes:
> And when I look at that diffs file, this is what I see:

> - [NO_PID]: ECPGconnect: could not open database: FATAL:  database
> "regress_ecpg_user2" does not exist

> Why would this database not be getting created?

No, you're reading it backwards: the error is expected, but it's not
appearing in your results.  I can duplicate this if I manually create
database "regress_ecpg_user2" before running ecpg's installcheck,
so I guess that's what you did.  I can find no evidence that any
part of the PG regression tests creates such a database.

regards, tom lane




Re: [HACKERS] psql --batch

2017-08-28 Thread Fabien COELHO



I don't doubt the usefulness of this configuration, but this specific
combination depends on usage, so I don't think adding a special option is
a good idea.


Although I agree with you that detailed settings are definitely debatable, 
I'd say that at least it would be a more reasonable starting point for 
scripting than the default configuration, which is oriented toward
interactive usage.


So even if imperfect, one is free to update the defaults to suit their 
needs more closely, e.g. "psql -B -F ':' ..."; at least most of the 
scripting conveniences are already there.


I think that such a scripting mode should also imply --no-readline.

--
Fabien.




Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?

2017-08-28 Thread Michael Paquier
On Mon, Aug 28, 2017 at 8:02 PM, Kyotaro HORIGUCHI
 wrote:
> The first patch (0001-) fixes this problem, preventing the
> problematic state of WAL segments by retarding restart LSN of a
> physical replication slot in a certain condition.

FWIW, I have this patch marked on my list of things to look at, so you
can count me as a reviewer. There are also some approaches that I
would like to test because I rely on replication slots for some
infrastructure. Still...

+if (oldFlushPtr != InvalidXLogRecPtr &&
+(restartLSN == InvalidXLogRecPtr ?
+ oldFlushPtr / XLOG_SEG_SIZE != flushPtr / XLOG_SEG_SIZE :
+ restartLSN / XLOG_BLCKSZ != flushPtr / XLOG_BLCKSZ))
I find such code patterns not readable.
-- 
Michael




Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?

2017-08-28 Thread Kyotaro HORIGUCHI
Hello,

This problem still occurs on the master.
I rebased this to the current master.

At Mon, 3 Apr 2017 08:38:47 +0900, Michael Paquier  
wrote in 
> On Mon, Apr 3, 2017 at 7:19 AM, Venkata B Nagothi  wrote:
> > As we are already past the commitfest, I am not sure, what should i change
> > the patch status to ?
> 
> The commit fest finishes on the 7th of April. Even with the deadline
> passed, there is nothing preventing to work on bug fixes. So this item
> ought to be moved to the next CF with the same category.

The steps to reproduce the problem follows.

- Apply the second patch (0002-) attached and recompile. It
  effectively reproduces the problematic state of the database.

- M(aster): initdb the master with wal_keep_segments = 0
(default), log_min_messages = debug2
- M: Create a physical repslot.
- S(tandby): Setup a standby database.
- S: Edit recovery.conf to use the replication slot above then
 start it.
- S: touch /tmp/hoge
- M: Run pgbench ...
- S: After a while, the standby stops.
  > LOG:   STOP THE SERVER

- M: Stop pgbench.
- M: Do 'checkpoint;' twice.
- S: rm /tmp/hoge
- S: Fails to catch up with the following error.

  > FATAL:  could not receive data from WAL stream: ERROR:  requested WAL 
segment 0001002B has already been removed


The first patch (0001-) fixes this problem, preventing the
problematic state of WAL segments by retarding restart LSN of a
physical replication slot in a certain condition.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 3813599b74299f1da8d0567ed90542c5f35ed48b Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 1 Feb 2017 16:07:22 +0900
Subject: [PATCH 1/2] Retard restart LSN of a slot when a segment starts with a
 contrecord.

A physical-replication standby can stop just at the boundary of WAL
segments. restart_lsn of a slot on the master can be assumed to be the
same location. In that case the last segment on the master will be
removed after a few checkpoints. If the last record of the last
replicated segment continues into the next segment, the continuation
record exists only on the master. The standby then cannot restart
because the split record is not available from a single source.

This patch holds back restart_lsn in the last segment when the first
page of the next segment is a continuation record.
---
 src/backend/replication/walsender.c | 105 +---
 1 file changed, 98 insertions(+), 7 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 03e1cf4..30c80af 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -217,6 +217,13 @@ static struct
 	WalTimeSample last_read[NUM_SYNC_REP_WAIT_MODE];
 }			LagTracker;
 
+/*
+ * This variable corresponds to restart_lsn in pg_replication_slots for a
+ * physical slot. This has a valid value only when it differs from the current
+ * flush pointer.
+ */
+static XLogRecPtr	   restartLSN = InvalidXLogRecPtr;
+
 /* Signal handlers */
 static void WalSndLastCycleHandler(SIGNAL_ARGS);
 
@@ -251,7 +258,7 @@ static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time);
 static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now);
 static bool TransactionIdInRecentPast(TransactionId xid, uint32 epoch);
 
-static void XLogRead(char *buf, XLogRecPtr startptr, Size count);
+static bool XLogRead(char *buf, XLogRecPtr startptr, Size count, bool notfoundok);
 
 
 /* Initialize walsender process before entering the main command loop */
@@ -546,6 +553,9 @@ StartReplication(StartReplicationCmd *cmd)
 			ereport(ERROR,
 	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 	 (errmsg("cannot use a logical replication slot for physical replication")));
+
+		/* Restore restartLSN from replication slot */
+		restartLSN = MyReplicationSlot->data.restart_lsn;
 	}
 
 	/*
@@ -561,6 +571,10 @@ StartReplication(StartReplicationCmd *cmd)
 	else
 		FlushPtr = GetFlushRecPtr();
 
+	/* Set InvalidXLogRecPtr if catching up */
+	if (restartLSN == FlushPtr)
+		restartLSN = InvalidXLogRecPtr;
+
 	if (cmd->timeline != 0)
 	{
 		XLogRecPtr	switchpoint;
@@ -770,7 +784,7 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req
 		count = flushptr - targetPagePtr;	/* part of the page available */
 
 	/* now actually read the data, we know it's there */
-	XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ);
+	XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ, false);
 
 	return count;
 }
@@ -1738,7 +1752,7 @@ static void
 ProcessStandbyReplyMessage(void)
 {
 	XLogRecPtr	writePtr,
-flushPtr,
+flushPtr, oldFlushPtr,
 applyPtr;
 	bool		replyRequested;
 	TimeOffset	writeLag,
@@ -1798,6 +1812,7 @@ ProcessStandbyReplyMessage(void)
 		WalSnd	   

Re: [HACKERS] expanding inheritance in partition bound order

2017-08-28 Thread Amit Langote
On 2017/08/26 3:28, Robert Haas wrote:
> On Mon, Aug 21, 2017 at 2:10 AM, Amit Langote
>  wrote:
>> [ new patches ]
> 
> I am failing to understand the point of separating PartitionDispatch
> into PartitionDispatch and PartitionTableInfo.  That seems like an
> unnecessary multiplication of entities, as does the introduction of
> PartitionKeyInfo.  I also think that replacing reldesc with reloid is
> not really an improvement; any places that gets the relid has to go
> open the relation to get the reldesc, whereas without that it has a
> direct pointer to the information it needs.

I am worried about the open relcache reference in PartitionDispatch when
we start using it in the planner.  Whereas ExecEndModifyTable() is a
suitable place to close that reference in the executor, there doesn't
seem to exist one within the planner, but I guess we will have to figure
something out.  For the time being, the second patch closes it in
expand_inherited_rtentry() right after picking up the OID using
RelationGetRelid(pd->reldesc).

> I suggest that this patch just focus on removing the following things
> from PartitionDispatchData: keystate, tupslot, tupmap.  Those things
> are clearly executor-specific stuff that makes sense to move to a
> different structure, what you're calling PartitionTupleRoutingInfo
> (not sure that's the best name).  The other stuff all seems fine.
> You're going to have to open the relation anyway, so keeping the
> reldesc around seems like an optimization, if anything.  The
> PartitionKey and PartitionDesc pointers may not really be needed --
> they're just pointers into reldesc -- but they're trivial to compute,
> so it doesn't hurt anything to have them either as a
> micro-optimization for performance or even just for readability.

OK, done this way in the attached updated patch.  Any suggestions about a
better name for what the patch calls PartitionTupleRoutingInfo?

> That just leaves indexes.  In a world where keystate, tupslot, and
> tupmap are removed from the PartitionDispatchData, you must need
> indexes or there would be no point in constructing a
> PartitionDispatchData object in the first place; any application that
> needs neither indexes nor the executor-specific stuff could just use
> the Relation directly.

Agreed.

> Regarding your XXX comments, note that if you've got a lock on a
> relation, the pointers to the PartitionKey and PartitionDesc are
> stable.  The PartitionKey can't change once it's established, and the
> PartitionDesc can't change while we've got a lock on the relation
> unless we change it ourselves (and any places that do should have
> CheckTableNotInUse checks).  The keep_partkey and keep_partdesc
> handling in relcache.c exists exactly so that we can guarantee that
> the pointer won't go stale under us.  Now, if we *don't* have a lock
> on the relation, then those pointers can easily be invalidated -- so
> you can't hang onto a PartitionDispatch for longer than you hang onto
> the lock on the Relation.  But that shouldn't be a problem.  I think
> you only need to hang onto PartitionDispatch pointers for the lifetime
> of a single query.  One can imagine optimizations where we try to
> avoid rebuilding that for subsequent queries but I'm not sure there's
> any demonstrated need for such a system at present.

Here too.

Attached are the updated patches.

Thanks,
Amit
From fb4bd4818c4faa08b3c4d37709f01dc55f256a46 Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 24 Jul 2017 18:59:57 +0900
Subject: [PATCH 1/2] Decouple RelationGetPartitionDispatchInfo() from executor

Currently it and the structure it generates viz. PartitionDispatch
objects are too coupled with the executor's tuple-routing code.  In
particular, it's pretty undesirable that it makes it the responsibility
of the caller to release some resources, such as executor tuple table
slots, tuple-conversion maps, etc.  That makes it harder to use in
places other than where it's currently being used.

After this refactoring, ExecSetupPartitionTupleRouting() now needs to
do some of the work that was previously done in
RelationGetPartitionDispatchInfo().
---
 src/backend/catalog/partition.c| 278 +++--
 src/backend/commands/copy.c|  37 +++--
 src/backend/executor/execMain.c| 124 +--
 src/backend/executor/nodeModifyTable.c |  37 +++--
 src/include/catalog/partition.h|  34 ++--
 src/include/executor/executor.h|   4 +-
 src/include/nodes/execnodes.h  |  40 -
 7 files changed, 326 insertions(+), 228 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 96a64ce6b2..25fc4583de 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -981,181 +981,147 @@ get_partition_qual_relid(Oid relid)
 }
 
 /*
- * Append OIDs of rel's partitions to the list 'partoids' and for each OID,
- * append pointer rel to 

Re: [HACKERS] Make pg_regress print a connstring with sockdir

2017-08-28 Thread Alvaro Herrera
Craig Ringer wrote:
> On 28 August 2017 at 15:19, Michael Paquier 
> wrote:
> 
> > On Mon, Aug 28, 2017 at 4:07 PM, Craig Ringer 
> > wrote:
> > > == starting postmaster==
> > > running with PID 30235; connect with:
> > >   psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'"
> > > == creating database "regression" ==
> >
> > Sorry if my words were confusing and have cost you three minutes of
> > development. I like better the one-line version :)
> > Now a socket path could be quite long. I can live with that personally.
> >
> 
> I'm not fussed, I just think we should show it one way or the other.
> 
> One nice thing about the two line form is that you can
> double-click/middle-click to open a new psql in the pg_regress session
> pretty much instantly.

So don't add gettext_noop() around it :-)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Restricting maximum keep segments by repslots

2017-08-28 Thread Kyotaro HORIGUCHI
Hello,

I'll add this to CF2017-09.

At Mon, 06 Mar 2017 18:20:06 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170306.182006.172683338.horiguchi.kyot...@lab.ntt.co.jp>
> Thank you for the comment.
> 
> At Fri, 3 Mar 2017 14:47:20 -0500, Peter Eisentraut 
>  wrote in 
> 
> > On 3/1/17 19:54, Kyotaro HORIGUCHI wrote:
> > >> Please measure it in size, not in number of segments.
> > > It was difficult to dicide which is reaaonable but I named it
> > > after wal_keep_segments because it has the similar effect.
> > > 
> > > In bytes(or LSN)
> > >  max_wal_size
> > >  min_wal_size
> > >  wal_write_flush_after
> > > 
> > > In segments
> > >  wal_keep_segments
> > 
> > We have been moving away from measuring in segments.  For example,
> > checkpoint_segments was replaced by max_wal_size.
> > 
> > Also, with the proposed patch that allows changing the segment size more
> > easily, this will become more important.  (I wonder if that will require
> > wal_keep_segments to change somehow.)
> 
> Agreed. It is 'max_slot_wal_keep_size' in the new version.
> 
> wal_keep_segments might should be removed someday.

- Following min/max_wal_size, the variable was renamed to
  "max_slot_wal_keep_size_mb" and is converted with ConvertToXSegs().

- Stopped warning when a checkpoint doesn't flush segments required
  by slots, even when max_slot_wal_keep_size has taken effect.

- Avoided a subtraction that could be negative.

regards,

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 105,110  int			wal_level = WAL_LEVEL_MINIMAL;
--- 105,111 
  int			CommitDelay = 0;	/* precommit delay in microseconds */
  int			CommitSiblings = 5; /* # concurrent xacts needed to sleep */
  int			wal_retrieve_retry_interval = 5000;
+ int			max_slot_wal_keep_size_mb = 0;
  
  #ifdef WAL_DEBUG
  bool		XLOG_DEBUG = false;
***
*** 9353,9361  KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
--- 9354,9385 
  	if (max_replication_slots > 0 && keep != InvalidXLogRecPtr)
  	{
  		XLogSegNo	slotSegNo;
+ 		int			slotlimitsegs = ConvertToXSegs(max_slot_wal_keep_size_mb);
  
  		XLByteToSeg(keep, slotSegNo);
  
+ 		/*
+ 		 * Ignore slots if they retain too many WAL segments;
+ 		 * max_slot_wal_keep_size applies on top of wal_keep_segments.
+ 		 */
+ 		if (max_slot_wal_keep_size_mb > 0 && slotSegNo + slotlimitsegs < segno)
+ 		{
+ 			segno = segno - slotlimitsegs; /* must be positive */
+ 
+ 			/*
+ 			 * Warn only if the checkpoint removes segments required by a slot.
+ 			 * We assume here that *logSegNo is the calculated keep location.
+ 			 */
+ 			if (slotSegNo < *logSegNo)
+ ereport(WARNING,
+ 	(errmsg ("restart LSN of replication slots is ignored by checkpoint"),
+ 	 errdetail("Some replication slots have lost required WAL segments to continue by up to %ld segments.",
+ 	   (segno < *logSegNo ? segno : *logSegNo) - slotSegNo)));
+ 
+ 			/* emergency vent */
+ 			slotSegNo = segno;
+ 		}
+ 
  		if (slotSegNo <= 0)
  			segno = 1;
  		else if (slotSegNo < segno)
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 2366,2371  static struct config_int ConfigureNamesInt[] =
--- 2366,2382 
  	},
  
  	{
+ 		{"max_slot_wal_keep_size", PGC_SIGHUP, REPLICATION_SENDING,
+ 			gettext_noop("Sets the maximum size of extra WALs kept by replication slots."),
+ 		 NULL,
+ 		 GUC_UNIT_MB
+ 		},
+ 		&max_slot_wal_keep_size_mb,
+ 		0, 0, INT_MAX,
+ 		NULL, NULL, NULL
+ 	},
+ 
+ 	{
  		{"wal_sender_timeout", PGC_SIGHUP, REPLICATION_SENDING,
  			gettext_noop("Sets the maximum time to wait for WAL replication."),
  			NULL,
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***
*** 235,240 
--- 235,241 
  #max_wal_senders = 10		# max number of walsender processes
  # (change requires restart)
  #wal_keep_segments = 0		# in logfile segments, 16MB each; 0 disables
+ #max_slot_wal_keep_size = 0	# in megabytes; 0 disables
  #wal_sender_timeout = 60s	# in milliseconds; 0 disables
  
  #max_replication_slots = 10	# max number of replication slots
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
***
*** 97,102  extern bool reachedConsistency;
--- 97,103 
  extern int	min_wal_size_mb;
  extern int	max_wal_size_mb;
  extern int	wal_keep_segments;
+ extern int	max_slot_wal_keep_size_mb;
  extern int	XLOGbuffers;
  extern int	XLogArchiveTimeout;
  extern int	wal_retrieve_retry_interval;



Re: [HACKERS] psql --batch

2017-08-28 Thread Pavel Stehule
2017-08-28 11:05 GMT+02:00 Craig Ringer :

> On 28 August 2017 at 16:23, Fabien COELHO  wrote:
>
>>
>> This doesn't really address the original issue though, that it's far from
>>> obvious how to easily and correctly script psql.
>>>
>>
>> That is another interesting argument. I understood that the issue was
>> having to type these options, but now it is also to remember which ones are
>> relevant and wanted, which is a little different and more justifiable as an
>> option.
>>
>> On that account, ISTM that '|' as a field separator is debatable, that
>> pager should be turned off... and maybe a few other things.
>>
>
>
> Good point re pager, though it's turned off automatically when stdout
> isn't a terminal, so in practice that'll only matter if you're using
> something like 'expect' that uses a pty. Still worth doing.
>
> I agree that | is a bit iffy, but so's anything, really. Null bytes aren't
> usable for all scripts, and anything else can also appear in the data
> itself. No easy answers there. In cases where I expect that to be an
> issue I sometimes use \COPY ... TO STDOUT WITH (FORMAT CSV) though.
>

I don't doubt the usefulness of this configuration, but this specific
combination depends on usage, so I don't think adding a special option is
a good idea.

Regards

Pavel


>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT

2017-08-28 Thread Kyotaro HORIGUCHI
Hello,

I'll add this to CF2017-09.

At Tue, 27 Jun 2017 16:27:18 +0900, Amit Langote 
 wrote in 
<75fe42df-b1d8-89ff-596d-d9da0749e...@lab.ntt.co.jp>
> On 2017/06/26 18:44, Kyotaro HORIGUCHI wrote:
> > At Mon, 26 Jun 2017 18:16:42 +0900, Amit Langote wrote:
> >>
> >> I recall I had proposed a fix for the same thing some time ago [1].
> > 
> > Wow. About 1.5 years ago and stuck? I have a concrete case where
> > this causes harm, so I'd like to fix it this time. How can we move on?
> 
> Agreed that this should be fixed.
> 
> Your proposed approach #1 to recheck the inheritance after obtaining the
> lock on the child table seems to be a good way forward.
> 
> Approach #2 of reordering locking is a simpler solution, but does not
> completely prevent the problem, because DROP TABLE child can still cause
> it to occur, as you mentioned.
> 
> >>> The cause is that NO INHERIT doesn't take an exlusive lock on the
> >>> parent. This allows expand_inherited_rtentry to add the child
> >>> relation into appendrel after removal from the inheritance but
> >>> still exists.
> >>
> >> Right.
> >>
> >>> I see two ways to fix this.
> >>>
> >>> The first patch adds a recheck of inheritance relationship if the
> >>> corresponding attribute is missing in the child in
> >>> make_inh_translation_list(). The recheck is a bit complex but it
> >>> is not performed unless the sequence above is happen. It checks
> >>> duplication of relid (or cycles in inheritance) following
> >>> find_all_inheritors (but doing a bit different) but I'm not sure
> >>> it is really useful.
> >>
> >> I had proposed adding a syscache on pg_inherits(inhrelid, inhparent), but
> >> I guess your hash table based solution will do the job as far as
> >> performance of this check is concerned, although I haven't checked the
> >> code closely.
> > 
> > The hash table is not crucial in the patch. Substantially it just
> > recurses down, looking up pg_inherits for the child. I considered
> > the new index but abandoned because I thought that such case
> > won't occur so frequently.
> 
> Agreed.  BTW, the hash table in patch #1 does not seem to be really
> helpful.  In the while loop in is_descendant_of_internal(), does
> hash_search() ever return found = true?  AFAICS, it does not.
> 
> >>> The second patch lets ALTER TABLE NO INHERIT to acquire locks on
> >>> the parent first.
> >>>
> >>> Since the latter has a larger impact on the current behavior and
> >>> we already treat "DROP TABLE child" case in the similar way, I
> >>> suppose that the first approach would be preferable.
> >>
> >> That makes sense.

So, I attached only the first patch, rebased on the current master
(it had in fact failed to apply there) and with a typo in a comment
fixed.

This still runs a closed-loop (cycle) check using a temporary hash,
but that seems a bit paranoid. (It is the same check as what
find_all_inheritors does.)


<< the following is another topic >>

> >> BTW, in the partitioned table case, the parent is always locked first
> >> using an AccessExclusiveLock.  There are other considerations in that case
> >> such as needing to recreate the partition descriptor upon termination of
> >> inheritance (both the DETACH PARTITION and also DROP TABLE child cases).
> > 
> > Apart from the degree of concurrency, if we keep parent->children
> > order of locking, such recreation does not seem to be
> > needed. Maybe I'm missing something.
> 
> Sorry to have introduced that topic in this thread, but I will try to
> explain anyway why things are the way they are currently:
> 
> Once a table is no longer a partition of the parent (detached or dropped),
> we must make sure that the next commands in the transaction don't see it
> as one.  That information is currently present in the relcache
> (rd_partdesc), which is used by a few callers, most notably the
> tuple-routing code.  Next commands must recreate the entry so that the
> correct thing happens based on the updated information.  More precisely,
> we must invalidate the current entry.  RelationClearRelation() will either
> delete the entry or rebuild it.  If it's being referenced somewhere, it
> will be rebuilt.  The place holding the reference may also be looking at
> the content of rd_partdesc, which we don't require them to make a copy of,
> so we must preserve its content while other fields of RelationData are
> being read anew from the catalog.  We don't have to preserve it if there
> has been any change (partition added/dropped), but to make such a change
> one would need to take a strong enough lock on the relation (parent).  We
> assume here that anyone who wants to reference rd_partdesc takes at least
> AccessShareLock lock on the relation, and anyone who wants to change its
> content must take a lock that will conflict with it, so
> AccessExclusiveLock.  Note that in all of this, we are only talking about
> one relation, that is the parent, so parent -> child ordering of taking
> locks may be irrelevant.

I think I 

Re: [HACKERS] show "aggressive" or not in autovacuum logs

2017-08-28 Thread Kyotaro HORIGUCHI
Hello,

Currently the closing message shows '%d skipped-frozen', but that
alone is insufficient to verify the true effect. This patch shows the
mode, 'aggressive' or 'normal', in the closing message of vacuum;
'%d skipped-frozen' reveals the true effect of ALL_FROZEN only in
aggressive mode.

I will add this patch to CF2017-09.

At Tue, 4 Apr 2017 20:29:38 +0900, Masahiko Sawada  
wrote in 
> On Tue, Apr 4, 2017 at 10:09 AM, Kyotaro HORIGUCHI
>  wrote:
> > | =# vacuum freeze verbose it;
> > | INFO:  vacuuming "public.it" in aggressive mode
> > | INFO:  "it": found 0 removable, 0 nonremovable row versions in 0 out of 0 
> > pages
> > ...
> > | Skipped 0 pages due to buffer pins, 0 frozen pages.
> >
> > I still feel a bit uneasy about the word "aggressive" here.
> 
> I think we can use the word "aggressive" here since we already use the
> word "aggressive vacuum" in docs[1], but it might be easily
> misunderstood.
> 
> [1] https://www.postgresql.org/docs/9.6/static/routine-vacuuming.html
> 
> >Is it better to be "freezing" or something?
> 
> Another idea could be something like "prevent wraparound". The
> autovacuum process doing aggressive vacuum appears in pg_stat_activity
> with the words " (to prevent wraparound)". This wording might be more
> user friendly IMO.

Hmm. This wording appears in several forms.

https://www.postgresql.org/docs/devel/static/sql-vacuum.html

> aggressive “freezing” of tuples. ... Aggressive freezing

https://www.postgresql.org/docs/devel/static/routine-vacuuming.html

> VACUUM will perform an aggressive vacuum,
> an anti-wraparound autovacuum

https://www.postgresql.org/docs/devel/static/runtime-config-client.html

> VACUUM performs an aggressive scan

ps title

> (to prevent wraparound)

The nearest common wording seems to be just aggressive (vacuum)
so I left it alone in the attached patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 009507d5ddb33229e4c866fef206962de39317cc Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 28 Aug 2017 13:12:25 +0900
Subject: [PATCH] Show "aggressive" or not in vacuum messages

VACUUM VERBOSE and autovacuum emit a log message with "n skipped-frozen",
but we cannot tell whether the vacuum was a normal (non-aggressive)
vacuum or an aggressive one that simply had no tuples to freeze. This
patch adds an indication of "aggressive" or "normal" to autovacuum logs
and to the VACUUM VERBOSE message.
---
 src/backend/commands/vacuumlazy.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 45b1859..71ecd50 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -373,10 +373,11 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 			 * emitting individual parts of the message when not applicable.
 			 */
			initStringInfo(&buf);
-			appendStringInfo(&buf, _("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"),
+			appendStringInfo(&buf, _("automatic vacuum of table \"%s.%s.%s\": mode: %s, index scans: %d\n"),
 			 get_database_name(MyDatabaseId),
 			 get_namespace_name(RelationGetNamespace(onerel)),
 			 RelationGetRelationName(onerel),
+			 aggressive ? "aggressive" : "normal",
 			 vacrelstats->num_index_scans);
			appendStringInfo(&buf, _("pages: %u removed, %u remain, %u skipped due to pins, %u skipped frozen\n"),
 			 vacrelstats->pages_removed,
@@ -487,9 +488,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 
 	relname = RelationGetRelationName(onerel);
 	ereport(elevel,
-			(errmsg("vacuuming \"%s.%s\"",
+			(errmsg("vacuuming \"%s.%s\" in %s mode",
 	get_namespace_name(RelationGetNamespace(onerel)),
-	relname)));
+	relname, aggressive ? "aggressive" : "normal")));
 
 	empty_pages = vacuumed_pages = 0;
 	num_tuples = tups_vacuumed = nkeep = nunused = 0;
-- 
2.9.2




Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-08-28 Thread Kyotaro HORIGUCHI
Thank you for your attention.

At Mon, 14 Aug 2017 17:33:48 -0400, Peter Eisentraut 
 wrote in 
<09fa011f-4536-b05d-0625-11f3625d8...@2ndquadrant.com>
> On 1/24/17 02:58, Kyotaro HORIGUCHI wrote:
> >> BTW, if you set a slightly larger
> >> context size on the patch you might be able to avoid rebases; right
> >> now the patch doesn't include enough context to uniquely identify the
> >> chunks against cacheinfo[].
> > git format-patch -U5 fuses all hunks on cacheinfo[] together. I'm
> > not sure that such a hunk can avoid rebases. Is this what you
> > suggested? -U4 added an identifiable forward context line for
> > some elements so the attached patch is made with four context
> > lines.
> 
> This patch needs another rebase for the upcoming commit fest.

This patch has had conflicts with several commits since the last
submission. I amended it to follow them (up to f97c55c), removed an
unnecessary branch, and edited some comments.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 4887f7d178b41a1a4729931a12bd396b9a8e8ee0 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 28 Aug 2017 11:36:21 +0900
Subject: [PATCH 1/2] Cleanup negative cache of pg_statistic when dropping a
 relation.

Accessing columns that don't have statistics leaves negative entries
in the catcache for pg_statistic, but there is no chance to remove
them. Especially when temporary tables are repeatedly created and
dropped, this bloats the catcache so much that memory pressure becomes
significant. This patch removes negative entries in STATRELATTINH,
ATTNAME and ATTNUM when the corresponding relation is dropped.
---
 src/backend/utils/cache/catcache.c |  57 ++-
 src/backend/utils/cache/syscache.c | 299 +++--
 src/include/utils/catcache.h   |   3 +
 src/include/utils/syscache.h   |   2 +
 4 files changed, 279 insertions(+), 82 deletions(-)

diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index e092801..e50c997 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -304,10 +304,11 @@ CatCachePrintStats(int code, Datum arg)
 
 		if (cache->cc_ntup == 0 && cache->cc_searches == 0)
 			continue;			/* don't print unused caches */
-		elog(DEBUG2, "catcache %s/%u: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits",
+		elog(DEBUG2, "catcache %s/%u: %d tup, %d negtup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits",
 			 cache->cc_relname,
 			 cache->cc_indexoid,
 			 cache->cc_ntup,
+			 cache->cc_nnegtup,
 			 cache->cc_searches,
 			 cache->cc_hits,
 			 cache->cc_neg_hits,
@@ -374,6 +375,10 @@ CatCacheRemoveCTup(CatCache *cache, CatCTup *ct)
 	/* free associated tuple data */
 	if (ct->tuple.t_data != NULL)
 		pfree(ct->tuple.t_data);
+
+	if (ct->negative)
+		--cache->cc_nnegtup;
+
 	pfree(ct);
 
 	--cache->cc_ntup;
@@ -572,6 +577,49 @@ ResetCatalogCache(CatCache *cache)
 }
 
 /*
+ *		CleanupCatCacheNegEntries
+ *
+ *	Remove negative cache tuples matching a partial key.
+ *
+ */
+void
+CleanupCatCacheNegEntries(CatCache *cache, ScanKeyData *skey)
+{
+	int i;
+
+	/* If this cache has no negative entries, nothing to do */
+	if (cache->cc_nnegtup == 0)
+		return;
+
+	/* searching with a partial key means scanning the whole cache */
+	for (i = 0; i < cache->cc_nbuckets; i++)
+	{
+		dlist_head *bucket = &cache->cc_bucket[i];
+		dlist_mutable_iter iter;
+
+		dlist_foreach_modify(iter, bucket)
+		{
+			CatCTup*ct = dlist_container(CatCTup, cache_elem, iter.cur);
+			bool		res;
+
+			if (!ct->negative)
+				continue;
+
+			HeapKeyTest(&ct->tuple, cache->cc_tupdesc, 1, skey, res);
+			if (!res)
+				continue;
+
+			/*
+			 * Negative cache entries can no longer be referenced, so we can
+			 * remove them unconditionally.
+			 */
+			CatCacheRemoveCTup(cache, ct);
+		}
+	}
+}
+
+
+/*
  *		ResetCatalogCaches
  *
  * Reset all caches when a shared cache inval event forces it
@@ -718,6 +766,7 @@ InitCatCache(int id,
 	cp->cc_relisshared = false; /* temporary */
 	cp->cc_tupdesc = (TupleDesc) NULL;
 	cp->cc_ntup = 0;
+	cp->cc_nnegtup = 0;
 	cp->cc_nbuckets = nbuckets;
 	cp->cc_nkeys = nkeys;
 	for (i = 0; i < nkeys; ++i)
@@ -1215,8 +1264,8 @@ SearchCatCache(CatCache *cache,
 
 		CACHE4_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples",
 	cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup);
-		CACHE3_elog(DEBUG2, "SearchCatCache(%s): put neg entry in bucket %d",
-	cache->cc_relname, hashIndex);
+		CACHE4_elog(DEBUG2, "SearchCatCache(%s): put neg entry in bucket %d, total %d",
+	cache->cc_relname, hashIndex, cache->cc_nnegtup);
 
 		/*
 		 * We are not returning the negative entry to the caller, so leave its
@@ -1667,6 +1716,8 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp,
 
 	cache->cc_ntup++;
 	CacheHdr->ch_ntup++;
+	if (negative)
+		cache->cc_nnegtup++;
 
 	/*

Re: [HACKERS] WIP: Separate log file for extension

2017-08-28 Thread Antonin Houska
Craig Ringer  wrote:

> On 25 August 2017 at 15:12, Antonin Houska  wrote:
> 
> How will this play with syslog? csvlog? etc?

There's nothing special about csvlog: the LogStream structure has a
"destination" field, so if a particular extension wants this kind of output,
it simply sets the LOG_DESTINATION_CSVLOG bit there.

LOG_DESTINATION_SYSLOG is a problem because (AFAIK) a single process can
maintain no more than one connection to the syslog server. I don't like the
idea of closing the current connection and opening a different one whenever
the next ereport() is associated with a different stream than the previous
one was.

As for LOG_DESTINATION_EVENTLOG, I haven't checked yet.

> I wonder if a level of indirection is appropriate here, where extensions (or
> postgres subsystems, even) provide a log stream label. Then the logging
> backed takes care of using that appropriately for the logging mechanism in
> use; for logging to file that'd generally be separate files. Same for
> CSVlog. Other mechanisms could be left as stubs initially.
> 
> So the outcome would be the same, just without the assumption of specific
> file name and output mechanism baked in.

The example I showed in my previous email was the simplest case I could
think of, but that does not mean the file name has to be hard-coded in the
extension. Instead of setting the LogStream.filename field directly, you can
pass a pointer to this field to the DefineCustomStringVariable() function,
so a specific GUC can control the value.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql --batch

2017-08-28 Thread Craig Ringer
On 28 August 2017 at 16:23, Fabien COELHO  wrote:

>
> This doesn't really address the original issue though, that it's far from
>> obvious how to easily and correctly script psql.
>>
>
> That is another interesting argument. I understood that the issue was
> having to type these options, but now it is also to remember which one are
> relevant and wanted, which is a little different and more justifiable as an
> option.
>
> On that account, ISTM that '|' as a field separator is debatable, that
> pager should be turned off... and maybe a few other things.
>


Good point re pager, though it's turned off automatically when stdout isn't
a terminal, so in practice that'll only matter if you're using something
like 'expect' that uses a pty. Still worth doing.

I agree that | is a bit iffy, but so is anything, really. Null bytes aren't
usable in all scripts, and any other separator can also appear in the data
itself. No easy answers there. In cases where I expect that to be an issue,
I sometimes use \COPY ... TO STDOUT WITH (FORMAT CSV) instead.
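As a rough sketch of that last approach (the helper name `dump_csv`, the database, and the query below are made-up placeholders, not anything from this thread): CSV quotes fields only when needed, so embedded separators stay unambiguous.

```shell
# Hypothetical helper: run a query via \copy and emit CSV on stdout.
# CSV quoting keeps embedded '|', commas, and newlines unambiguous,
# unlike any fixed field separator. Note the psql variable is actually
# spelled ON_ERROR_STOP.
dump_csv() {
    db=$1
    query=$2
    psql -qAtX -v ON_ERROR_STOP=1 -d "$db" \
         -c "\\copy ($query) TO STDOUT WITH (FORMAT CSV)"
}
```

A script would then call, e.g., `dump_csv regression "SELECT relname, relkind FROM pg_class"` and parse the CSV with standard tools.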

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist

2017-08-28 Thread Ryan Murphy
Hello Postgres Hackers,

I want to become more helpful to the project, reviewing more patches and
starting to write more of my own - and one of the first steps is to get all
the tests passing so I can confidently review patches.  It almost worked...

I typed "make check" and all the tests passed.
Then I typed "make installcheck" and all the tests passed again.

But when I typed "make installcheck-world", I got this:

===
 1 of 55 tests failed.
===

The differences that caused some tests to fail can be viewed in the
file "path/to/postgres/src/interfaces/ecpg/test/regression.diffs".  A copy
of the test summary that you see
above is saved in the file
"path/to/postgres/src/interfaces/ecpg/test/regression.out".

And when I look at that diffs file, this is what I see:

- [NO_PID]: ECPGconnect: could not open database: FATAL:  database
"regress_ecpg_user2" does not exist

Why would this database not be getting created?  The full diffs file is
attached.

Thanks for your help!
Ryan


regression.diffs
Description: Binary data



Re: [HACKERS] [POC] hash partitioning

2017-08-28 Thread Yugo Nagata
Hi young,

On Mon, 28 Aug 2017 15:33:46 +0800
"yang...@highgo.com"  wrote:

> Hello
> 
> Looking at your hash partitioning syntax, I implemented a hash partition in a 
> more concise way, with no need to determine the number of sub-tables, and 
> dynamically add partitions.

I think this is great work, but the current consensus on hash partitioning
favors Amul's patch[1], whose syntax is different from my original proposal.
So you will have to read Amul's patch and start a discussion there if you
still want to propose your implementation.

Regards,

[1] 
https://www.postgresql.org/message-id/CAAJ_b965A2oog=6efuhelexl3rmgfssb3g7lwkva1bw0wuj...@mail.gmail.com


> 
> Description
> 
> The hash partition implementation is based on the original range/list
> partitioning, and uses similar syntax.
> 
> To create a partitioned table ,use:
> 
> CREATE TABLE h (id int) PARTITION BY HASH(id);
> 
> The partitioning key currently supports only one column. I think the
> partition key could support multiple columns, which may be difficult to
> handle when querying, but it is not impossible.
> 
> A partition table can be created as below:
> 
>  CREATE TABLE h1 PARTITION OF h;
>  CREATE TABLE h2 PARTITION OF h;
>  CREATE TABLE h3 PARTITION OF h;
>  
> The FOR VALUES clause cannot be used; the partition bound is calculated
> automatically as a partition index from a single integer value.
> 
> An inserted record is stored in the partition whose index equals
>
>   DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0],
>       TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts;  /* nparts = number of partitions */
>
> In the above example, this is
>
>   DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0],
>       TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;
> 
> postgres=# insert into h select generate_series(1,20);
> INSERT 0 20
> postgres=# select tableoid::regclass,* from h;
>  tableoid | id 
> --+
>  h1   |  3
>  h1   |  5
>  h1   | 17
>  h1   | 19
>  h2   |  2
>  h2   |  6
>  h2   |  7
>  h2   | 11
>  h2   | 12
>  h2   | 14
>  h2   | 15
>  h2   | 18
>  h2   | 20
>  h3   |  1
>  h3   |  4
>  h3   |  8
>  h3   |  9
>  h3   | 10
>  h3   | 13
>  h3   | 16
> (20 rows)
> 
> The number of partitions can be increased dynamically. When a new
> partition is created, the number of partitions changes, so the calculated
> target partition for a value changes as well, and the same data cannot
> sensibly live in different partitions. Therefore, when you create a new
> partition, the existing data has to be re-hashed and moved into its new
> target partition.
> 
> postgres=# create table h4 partition of h;
> CREATE TABLE
> postgres=# select tableoid::regclass,* from h;
>  tableoid | id 
> --+
>  h1   |  5
>  h1   | 17
>  h1   | 19
>  h1   |  6
>  h1   | 12
>  h1   |  8
>  h1   | 13
>  h2   | 11
>  h2   | 14
>  h3   |  1
>  h3   |  9
>  h3   |  2
>  h3   | 15
>  h4   |  3
>  h4   |  7
>  h4   | 18
>  h4   | 20
>  h4   |  4
>  h4   | 10
>  h4   | 16
> (20 rows)
> 
> When querying the data, the hash partition uses the same algorithm as the 
> insertion, and filters out the table that does not need to be scanned.
> 
> postgres=# explain analyze select * from h where id = 1;
>  QUERY PLAN   
>   
> 
>  Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 
> loops=1)
>->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4) (actual 
> time=0.013..0.016 rows=1 loops=1)
>  Filter: (id = 1)
>  Rows Removed by Filter: 3
>  Planning time: 0.346 ms
>  Execution time: 0.061 ms
> (6 rows)
> 
> postgres=# explain analyze select * from h where id in (1,5);;
>  QUERY PLAN   
>   
> 
>  Append  (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 
> loops=1)
>->  Seq Scan on h1  (cost=0.00..41.88 rows=26 width=4) (actual 
> time=0.015..0.018 rows=1 loops=1)
>  Filter: (id = ANY ('{1,5}'::integer[]))
>  Rows Removed by Filter: 6
>->  Seq Scan on h3  (cost=0.00..41.88 rows=26 width=4) (actual 
> time=0.005..0.007 rows=1 loops=1)
>  Filter: (id = ANY ('{1,5}'::integer[]))
>  Rows Removed by Filter: 3
>  Planning time: 0.720 ms
>  Execution time: 0.074 ms
> (9 rows)
> 
> postgres=# explain analyze select * from h where id = 1 or id = 5;;
>  QUERY PLAN   
>   
> 

Re: [HACKERS] psql --batch

2017-08-28 Thread Fabien COELHO



This doesn't really address the original issue though, that it's far from
obvious how to easily and correctly script psql.


That is another interesting argument. I understood that the issue was 
having to type these options, but now it is also to remember which one are 
relevant and wanted, which is a little different and more justifiable as 
an option.


On that account, ISTM that '|' as a field separator is debatable, that 
pager should be turned off... and maybe a few other things.


--
Fabien.




Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-28 Thread Fabien COELHO


Hello Masahiko-san,

Patch applies cleanly, compiles, works for me.


At least: "Required to invoke" -> "Require to invoke".


Fixed.


I meant the added one about -I, not the pre-existing one about -i.


About the code:

is_no_vacuum should be a bool?


We can change it but I think there is no difference actually. So
keeping it would be better.


I would like to insist a little bit: the name was declared as an int but 
passed to init and used as a bool there before the patch. Conceptually 
what is meant is really a bool, and I see no added value bar saving a few 
strokes to have an int. ISTM that recent pgbench changes have started 
turning old int-for-bool habits into using bool when bool is meant.


initialize_cmds initialization: rather use pg_strdup instead of 
pg_malloc/strcpy?


-I: pg_free before pg_strdup to avoid a small memory leak?

I'm not sure I would have bothered with sizeof(char), but why not!

I'm still a little bit annoyed by checkCustomCmds: when unhappy, it shows
an error message and returns false, which immediately results in exit(1).
However, the pattern elsewhere in pgbench is to show the error and exit
immediately. I would suggest simplifying by void-ing the function and
exiting instead of returning false.


--
Fabien.




Re: [HACKERS] psql --batch

2017-08-28 Thread Craig Ringer
On 28 August 2017 at 15:34, Pavel Stehule  wrote:

>
>
> 2017-08-28 9:33 GMT+02:00 Fabien COELHO :
>
>>
>> ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally
>> encountered as well, the other one letter ones are not too bad. Maybe it
>> would be enough to have a shortcut for this one, say "-B"?

>>>
>>> I agree with last sentence. I don't think so -qAtX are expected always,
>>> but
>>> "-v ON_ERRORS_STOP=1" is pretty often.
>>>
>>
>> Yep. I often "\set" that in the script.
>>
>> What do you think about long option "--on-errors-stop" ?
>>>
>>
>> It does not really relieve the typing pain Craig is complaining about,
>> but it would be ok as the long option version if -B is added, and it is
>> auto-completion friendly.
>
>
>
This doesn't really address the original issue though, that it's far from
obvious how to easily and correctly script psql.

I guess there's always the option of a docs patch for that. *shrug*

I'll see what others have to say.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Make pg_regress print a connstring with sockdir

2017-08-28 Thread Craig Ringer
On 28 August 2017 at 15:19, Michael Paquier 
wrote:

> On Mon, Aug 28, 2017 at 4:07 PM, Craig Ringer 
> wrote:
> > == starting postmaster==
> > running with PID 30235; connect with:
> >   psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'"
> > == creating database "regression" ==
>
> Sorry if my words were confusing and have cost you three minutes of
> development. I like better the one-line version :)
> Now a socket path could be quite long. I can live with that personally.
>

I'm not fussed, I just think we should show it one way or the other.

One nice thing about the two line form is that you can
double-click/middle-click to open a new psql in the pg_regress session
pretty much instantly.




-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] psql --batch

2017-08-28 Thread Pavel Stehule
2017-08-28 9:33 GMT+02:00 Fabien COELHO :

>
> ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally
>>> encountered as well, the other one letter ones are not too bad. Maybe it
>>> would be enough to have a shortcut for this one, say "-B"?
>>>
>>
>> I agree with last sentence. I don't think so -qAtX are expected always,
>> but
>> "-v ON_ERRORS_STOP=1" is pretty often.
>>
>
> Yep. I often "\set" that in the script.
>
> What do you think about long option "--on-errors-stop" ?
>>
>
> It does not really relieve the typing pain Craig is complaining about, but
> it would be ok as the long option version if -B is added, and it is
> auto-completion friendly.


ok

Pavel

>
>
> --
> Fabien.
>


Re: [HACKERS] psql --batch

2017-08-28 Thread Fabien COELHO



ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally
encountered as well, the other one letter ones are not too bad. Maybe it
would be enough to have a shortcut for this one, say "-B"?


I agree with last sentence. I don't think so -qAtX are expected always, but
"-v ON_ERRORS_STOP=1" is pretty often.


Yep. I often "\set" that in the script.


What do you think about long option "--on-errors-stop" ?


It does not really relieve the typing pain Craig is complaining about, but 
it would be ok as the long option version if -B is added, and it is 
auto-completion friendly.


--
Fabien.




Re: [HACKERS] psql --batch

2017-08-28 Thread Pavel Stehule
2017-08-28 8:56 GMT+02:00 Fabien COELHO :

>
> I find myself regurgitating the incantation
>>
>> psql -qAtX -v ON_ERRORS_STOP=1
>>
>> quite a bit. It's not ... super friendly.
>>
>> It strikes me that we could possibly benefit from a 'psql --batch' option.
>>
>> Thoughts?
>>
>
> The link between -qAtX and "batch" is not that fully obvious, especially
> the unaligned tuples-only part. If so, why not some -F  as well?
>
> ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally
> encountered as well, the other one letter ones are not too bad. Maybe it
> would be enough to have a shortcut for this one, say "-B"?
>

I agree with last sentence. I don't think so -qAtX are expected always, but
"-v ON_ERROS_STOP=1" is pretty often.

What do you think about long option "--on-errors-stop" ?

Regards

Pavel


> alias?
>
> --
> Fabien.
>
>
>


Re: [HACKERS] Make pg_regress print a connstring with sockdir

2017-08-28 Thread Michael Paquier
On Mon, Aug 28, 2017 at 4:07 PM, Craig Ringer  wrote:
> == starting postmaster==
> running with PID 30235; connect with:
>   psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'"
> == creating database "regression" ==

Sorry if my words were confusing and have cost you three minutes of
development. I like better the one-line version :)
Now a socket path could be quite long. I can live with that personally.
-- 
Michael




Re: [HACKERS] Make pg_regress print a connstring with sockdir

2017-08-28 Thread Craig Ringer
== starting postmaster==
running with PID 30235; connect with:
  psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'"
== creating database "regression" ==


On 28 August 2017 at 14:08, Michael Paquier 
wrote:

> On Mon, Aug 28, 2017 at 2:28 PM, Craig Ringer 
> wrote:
> > It's a pain having to find the postmaster command line to get the port
> > pg_regress started a server on. We print the port in the pg_regress
> output,
> > why not the socket directory / host?
> >
> > How about
> > running on 'port=50848 host=/tmp/pg_regress-UMrcT3' with PID 16409
> >
> > per the attached?
> >
> > If you'd prefer nicer wording at the expense of two lines, maybe
> >
> > running with PID 16409
> > connection string: 'port=50848 host=/tmp/blah'
>
> Yeah, I think that this is a good idea.
> --
> Michael
>



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From c3613d4526ba04fb18011e2af1923b9a167effec Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 28 Aug 2017 13:28:05 +0800
Subject: [PATCH v2] Show sockdir/hostname in pg_regress startup output

---
 src/test/regress/pg_regress.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index abb742b..4cccefc 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2438,8 +2438,10 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 #else
 #define ULONGPID(x) (unsigned long) (x)
 #endif
-		printf(_("running on port %d with PID %lu\n"),
-			   port, ULONGPID(postmaster_pid));
+		printf(_("running with PID %lu; connect with:\n"),
+ 			   ULONGPID(postmaster_pid));
+		printf(gettext_noop("  psql \"host='%s' port=%d dbname='%s'\"\n"),
+			   hostname ? hostname : sockdir, port, dblist->str);
 	}
 	else
 	{
-- 
2.9.5




Re: [HACKERS] psql --batch

2017-08-28 Thread Craig Ringer
On 28 August 2017 at 14:56, Fabien COELHO  wrote:

>
> I find myself regurgitating the incantation
>>
>> psql -qAtX -v ON_ERRORS_STOP=1
>>
>> quite a bit. It's not ... super friendly.
>>
>> It strikes me that we could possibly benefit from a 'psql --batch' option.
>>
>> Thoughts?
>>
>
> The link between -qAtX and "batch" is not that fully obvious, especially
> the unaligned tuples-only part. If so, why not some -F  as well?
>
>
q: quiet

Pretty much always wanted for a batch mode run of anything.

A: unaligned tuples

Because alignment is pretty much never useful when you're parsing result
sets with scripting (splitting, cut, etc) and just makes everything harder.
The alignment psql uses isn't fixed, after all.

t: tuples-only

Headers just make everything more annoying to parse, and force you to do
extra work to skip them. If you're running batch code you know the headers
because you used a column-list form SELECT, or should've. You're unlikely
to be ingesting them and using them to split up the tuple anyway. I think
this one is a bit more arguable than the first two, though, as I can at
least think of some cases where you might want it.

X: skip .psqlrc

Reliable, portable scripted psql shouldn't be using the local .psqlrc IMO.
It's likely to just break things in exciting ways. But I can see it being
reasonable to require this option to be supplied separately and just
document it as "recommended" with --batch.
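For what it's worth, a tiny wrapper along these lines already captures most of what a --batch option would do (the name `psql_batch` is invented here, not an existing psql option; note the variable's real spelling is ON_ERROR_STOP):

```shell
# Hypothetical wrapper bundling the script-friendly flags discussed above:
# -q quiet, -A unaligned, -t tuples only, -X skip .psqlrc, and stop on the
# first error (the psql variable is spelled ON_ERROR_STOP).
psql_batch() {
    psql -qAtX -v ON_ERROR_STOP=1 "$@"
}
```

A script would then call, e.g., `ids=$(psql_batch -d mydb -c 'SELECT id FROM t')`.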



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Re: Poor cost estimate with interaction between table correlation and partial indexes

2017-08-28 Thread Alvaro Herrera
Michael Malis wrote:
> Hmm... It seems the command I used for obtaining a patch I got from
> here https://wiki.postgresql.org/wiki/Working_with_Git truncated part
> of the patch. I've attached the file generated from git diff
> --patience master improve-partial-index-correlation-calculation
> --no-color > improve-correlated-partial-index-cost-v2.patch to this
> email. What is the correct command for generating a context diff?

Yeah, I've had patches truncated by that too and I've never cared enough
to see about getting it fixed.  I think it's a bug in the filterdiff
utility, but I got a stupid answer from the Debian maintainer when I
reported it and didn't care to follow up any further.  Eventually I gave
up on using context diffs because of this problem.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] psql --batch

2017-08-28 Thread Fabien COELHO



I find myself regurgitating the incantation

psql -qAtX -v ON_ERRORS_STOP=1

quite a bit. It's not ... super friendly.

It strikes me that we could possibly benefit from a 'psql --batch' option.

Thoughts?


The link between -qAtX and "batch" is not that fully obvious, especially 
the unaligned tuples-only part. If so, why not some -F  as well?


ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally 
encountered as well, the other one letter ones are not too bad. Maybe it 
would be enough to have a shortcut for this one, say "-B"?


alias?

--
Fabien.




[HACKERS] NoMovementScanDirection in heapgettup() and heapgettup_pagemode()

2017-08-28 Thread Mithun Cy
Hi all,
I was trying to study the NoMovementScanDirection part of heapgettup()
and heapgettup_pagemode(). If I am right, there is no test in the test
suite that hits this code. I ran make check-world and could not hit it.
Also, the coverage report at
https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html
shows this part of the code is not hit. Can somebody please help me
understand this part of the code and how to test it?

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Make pg_regress print a connstring with sockdir

2017-08-28 Thread Michael Paquier
On Mon, Aug 28, 2017 at 2:28 PM, Craig Ringer  wrote:
> It's a pain having to find the postmaster command line to get the port
> pg_regress started a server on. We print the port in the pg_regress output,
> why not the socket directory / host?
>
> How about
> running on 'port=50848 host=/tmp/pg_regress-UMrcT3' with PID 16409
>
> per the attached?
>
> If you'd prefer nicer wording at the expense of two lines, maybe
>
> running with PID 16409
> connection string: 'port=50848 host=/tmp/blah'

Yeah, I think that this is a good idea.
-- 
Michael

