Re: ToDo: show size of partitioned table

2018-12-17 Thread Amit Langote
Hi,

Thank you for updating the patch.

On 2018/12/17 17:48, Pavel Stehule wrote:
> new update of this patch

Documentation portion of this patch still contains some typos that I
mentioned before here:

https://www.postgresql.org/message-id/1c83bb5c-47cd-d796-226c-e95795b05551%40lab.ntt.co.jp

+ .. If the form \dP+
+is used, the sum of size of related partitions (including the
+table and indexes, if any) and a description
+are also displayed.

+ ... If the form \dPi+
+is used, the sum of size of related indexes and a description
+are also displayed.

+ ... If the form \dPt+
+is used, the sum of size of related tables and a description
+are also displayed.

In all of the three hunks:

the sum of size of -> the sum of "sizes" of

and a description -> and associated description

> changes:
> 
> 1. only root partitioned tables are displayed - you can quickly see the total
> allocated space. It is not duplicated due to nested partitions.

+1

If one wants to see a non-root partitioned table's details, they can use
\dP+ with a pattern.

> I can imagine a new additional flag - like "n" for nested - and then we can
> display nested partitioned tables with parent table info. Something like
> 
> \dPt - show only root partition tables
> \dPnt or \dPtn - show root and nested partitioned tables

Too much complication maybe?

> 2. \dP without pattern shows root partitioned tables + total relation size.
> When pattern is defined, then shows tables and indexes + table size
> 
> postgres=# \dP+
>                  List of partitioned relations
> ┌────────┬────────────┬───────┬────────┬─────────────┐
> │ Schema │    Name    │ Owner │  Size  │ Description │
> ╞════════╪════════════╪═══════╪════════╪═════════════╡
> │ public │ parent_tab │ pavel │ 120 kB │             │
> └────────┴────────────┴───────┴────────┴─────────────┘
> (1 row)
> 
> postgres=# \dP+ *
>                       List of partitioned relations or indexes
> ┌────────┬──────────────┬───────┬───────────────────┬────────────┬───────┬─────────────┐
> │ Schema │     Name     │ Owner │       Type        │   Table    │ Size  │ Description │
> ╞════════╪══════════════╪═══════╪═══════════════════╪════════════╪═══════╪═════════════╡
> │ public │ parent_index │ pavel │ partitioned index │ parent_tab │ 80 kB │             │
> │ public │ parent_tab   │ pavel │ partitioned table │            │ 40 kB │             │
> └────────┴──────────────┴───────┴───────────────────┴────────────┴───────┴─────────────┘
> (2 rows)
> 
> postgres=# \dP+ *index
>                       List of partitioned relations or indexes
> ┌────────┬──────────────┬───────┬───────────────────┬────────────┬───────┬─────────────┐
> │ Schema │     Name     │ Owner │       Type        │   Table    │ Size  │ Description │
> ╞════════╪══════════════╪═══════╪═══════════════════╪════════════╪═══════╪═════════════╡
> │ public │ parent_index │ pavel │ partitioned index │ parent_tab │ 80 kB │             │
> └────────┴──────────────┴───────┴───────────────────┴────────────┴───────┴─────────────┘
> (1 row)

Looking at the patch:

+   if (pattern)
+       /* translator: objects_name is "indexes", "tables" or "relations" */
+       psql_error("Did not find any partitioned %s named \"%s\".\n",
+                  objects_name,
+                  pattern);
+   else
+       /* translator: object_name is "index", "table" or "relation" */
+       psql_error("Did not find any partitioned %s.\n",
+                  object_name);

It seems that objects_name and object_name need to be swapped between the
if and else blocks, and so do /* translator: ... */ comments.

if (pattern)
/* translator: object_name is "index", "table" or "relation" */
psql_error(..., object_name);
else
/* translator: objects_name is "indexes", "tables" or "relations" */
psql_error(..., objects_name);

That is, it should say "Did not find any partitioned index/table/relation
named "foo"" and "Did not find any partitioned indexes/tables/relations".

Thanks,
Amit




Discussion: Fast DB/Schema/Table disk size check in Postgresql

2018-12-17 Thread Hubert Zhang
Hi all,

For very large databases, the dbsize functions such as `pg_database_size_name()`
could be quite slow, since they need to call the `stat()` system call on
every file in the target database.

We propose a new solution to accelerate these dbsize functions, which check
the disk usage of a database/schema/table. The new functions avoid calling
`stat()` and instead store the disk usage of database objects in user tables
(called diskquota.xxx_size, where xxx is db/schema/table). Checking the size of
database 'postgres' then becomes the SQL query
`select size from diskquota.db_size where name = 'postgres'`.
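
For illustration, the difference looks roughly like this (only a sketch; it
assumes the diskquota.db_size table with name/size columns described above):

-- today: walks every file of the database with stat(), slow for large databases
SELECT pg_size_pretty(pg_database_size('postgres'));

-- proposed: a plain lookup against the size table maintained by the worker
SELECT pg_size_pretty(size) FROM diskquota.db_size WHERE name = 'postgres';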

To keep diskquota.table_size up to date, we follow a passive approach: a
separate background worker process refreshes the model periodically with a
user-defined delay. Compared with an active approach, which would update the
table size on the fly in the executor, our method does not harm OLTP
performance; all table size updates are done asynchronously. The background
worker process needs to detect table size changes, maintain the table size
model, and flush the table sizes into the user table. Luckily, the first two
parts are already solved in the diskquota extension.

We have introduced the diskquota extension in the past (see the diskquota
wiki). Diskquota is an extension to control the disk usage of database objects
such as schemas or roles. A DBA can set a quota limit for a schema or a role
on a database. The diskquota worker process then maintains the diskquota
model, which includes the current disk usage of each schema or role in the
database and the blacklist of schemas and roles whose quota limit has been
reached. Loading data into tables whose schema or role is in the diskquota
blacklist will be cancelled.

To support the fast disk size check functions, the diskquota worker process
only needs some additional logic to flush the diskquota model into the user
table diskquota.table_size, with little code change. Here is the algorithm
change (for the original diskquota algorithm used to refresh the diskquota
model, refer to the diskquota design document).

The changes to the original diskquota algorithm are described below (a SQL
sketch of the flush in step 3 follows the list):

   1. Fetch the latest user-defined quota limit by reading table
   'diskquota.quota_config'.
   2. Get the active table list (table_oid and size) from the Active Tables
   shared memory. Table size is calculated by pg_total_relation_size(table_oid).
   At the diskquota model initialization stage (e.g. after a database restart),
   it will first read table sizes from table diskquota.table_size to rebuild
   the table_size_map, schema_size_map and role_size_map directly. If table
   diskquota.table_size is empty, then all tables will be treated as active
   tables, and the Quota Size Checker will fetch the size of all tables
   directly.
   3. Traverse user tables in pg_class:
      1. If the table is in the active table list, calculate the delta of the
      table size change, update the corresponding table_size_map,
      namespace_size_map and role_size_map, and update tables
      diskquota.table_size, diskquota.schema_size and diskquota.role_size in
      the current database.
      2. If the table's schema changed, move the quota from the old schema to
      the new schema in namespace_size_map, and update table
      diskquota.schema_size in the current database.
      3. If the table's owner changed, move the quota from the old owner to
      the new owner in role_size_map, and update table diskquota.role_size in
      the current database.
      4. Mark the table as existing (not dropped) in table_size_map.
   4. Traverse table_size_map and detect tables not marked in step 3.4, i.e.
   'dropped' tables. Reduce their quota from the corresponding
   namespace_size_map and role_size_map, and update tables
   diskquota.table_size, diskquota.schema_size and diskquota.role_size in the
   current database.
   5. Traverse namespaces in pg_namespace:
      1. Remove dropped namespaces from namespace_size_map.
      2. Compare the quota usage and quota limit for each namespace, and put
      out-of-quota namespaces into the blacklist.
      3. Update table diskquota.schema_size in the current database.
   6. Traverse roles in pg_role:
      1. Remove dropped roles from role_size_map.
      2. Compare the quota usage and quota limit for each role, and put
      out-of-quota roles into the blacklist.
      3. Update table diskquota.role_size in the current database.
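
As mentioned above, the flush in step 3 could look roughly like the following
SQL sketch (the tableid/size column names and the primary key on tableid are
illustrative assumptions, not the final schema):

-- executed by the background worker for each table in the active table list
INSERT INTO diskquota.table_size (tableid, size)
VALUES (16384, pg_total_relation_size(16384))
ON CONFLICT (tableid) DO UPDATE SET size = EXCLUDED.size;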


Any comments on the fast disk size check functions are appreciated.
BTW, the diskquota extension needs to add some hook functions in Postgres. We
updated our patch in commitfest/21/1883. There is no reviewer yet.
Please help to review this patch if you are interested in the diskquota
extension. Thanks in advance!

-- 

Re: Should new partitions inherit their tablespace from their parent?

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 10:13:42PM -0300, Alvaro Herrera wrote:
> I think we need to accept the new output.  It is saying that previously
> we would not invoke the hook on SET TABLESPACE for a partitioned table,
> which makes sense, and now we do invoke it, which also makes sense.

Indeed, that looks right.
--
Michael




Re: Catalog views failed to show partitioned table information.

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 11:01:59AM +0900, Michael Paquier wrote:
> On Mon, Dec 17, 2018 at 10:22:28AM +0900, Amit Langote wrote:
>> On 2018/12/15 8:00, Michael Paquier wrote:
>>> I tend to agree with your comment here.  pg_tables lists partitioned
>>> tables, but pg_indexes is forgetting about partitioned indexes.  So this
>>> is a good thing to add.
>> 
>> +1
> 
> I'll go commit something close to what Suraj is proposing if there are
> no objections from others.  At least we agree on that part ;)

And this part is done.
--
Michael




Re: don't create storage when unnecessary

2018-12-17 Thread Amit Langote
Hi,

On 2018/12/18 14:56, Kyotaro HORIGUCHI wrote:
> Hello.
> 
> At Sun, 16 Dec 2018 17:47:16 -0300, Alvaro Herrera  
> wrote in <20181216204716.jddzijk3fb2dpdk4@alvherre.pgsql>
>> On 2018-Dec-07, Michael Paquier wrote:
>>
>>> A macro makes sense to control that.
>>
>> I added Ashutosh's RELKIND_HAS_STORAGE, but renamed it to
>> RELKIND_CAN_HAVE_STORAGE, because some of the relkinds can be mapped and
>> thus would have relfilenode set to 0.  I think this is a bit misleading
>> either way.
> 
> FWIW, I think RELKIND_HAS_STORAGE looks better to me.
> 
> # Since it's a bit too late, I don't insist on that.
> 
> Mapped relations have storage, which is signalled by relfilenode
> = 0 and the real file node is given by relation mapper. And it is
> actually created at bootstrap time. In this sense,
> (RELKIND_HAS_STORAGE && relfilenode == 0) in heap_create makes
> sense.
> 
> Assertion (..->relNode != InvalidOid) at the end of
> RelationInitPhysicalAddr doesn't fail. I think RELKIND_HAS_STORAGE
> is preferable also in this sense.

Sorry to be saying it late, but I have to agree with Horiguchi-san here
that RELKIND_HAS_STORAGE sounds better and is clear.  Looking at what's
committed:

/*
 * Relation kinds that have physical storage. These relations normally have
 * relfilenode set to non-zero, but it can also be zero if the relation is
 * mapped.
 */
#define RELKIND_CAN_HAVE_STORAGE(relkind) \
((relkind) == RELKIND_RELATION || \
 (relkind) == RELKIND_INDEX || \
 (relkind) == RELKIND_SEQUENCE || \
 (relkind) == RELKIND_TOASTVALUE || \
 (relkind) == RELKIND_MATVIEW)

So, they all *do have* storage, as the comment's first sentence also says.
Mixing the bit about mapped relations into the naming of this macro will
make it hard to reason about using it in other parts of the backend code
(although admittedly not in too far off places), where it shouldn't
matter whether the relation's file is determined using relfilenode or via
the relation-to-file mapper.

> ATExecSetTableSpaceNoStorage assumes that the relation
> actually has storage.

Hmm, it has the following Assert:

  /*
   * Shouldn't be called on relations having storage; these are processed
   * in phase 3.
   */
  Assert(!RELKIND_CAN_HAVE_STORAGE(rel->rd_rel->relkind));

So, it actually assumes that it's called for relations that don't have
storage (...but do have a tablespace property that applies to its children.)

Thanks,
Amit




Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)

2018-12-17 Thread Peter Geoghegan
On Mon, Dec 17, 2018 at 10:52 PM Andrey Lepikhov
 wrote:
> I did many tests of your solution inside the 'quick vacuum' strategy [1]
> and the background worker called 'heap cleaner' [2]. I must admit that
> when I use your patch, there is no problem with dependencies.

Cool.

> This patch needs the opinion of another reviewer.

Was I unclear on why the patch fixes the problem? Sorry, but I thought
it was obvious.

-- 
Peter Geoghegan



Re: Fixing findDependentObjects()'s dependency on scan order (regressions in DROP diagnostic messages)

2018-12-17 Thread Andrey Lepikhov




On 08.12.2018 6:58, Peter Geoghegan wrote:

  I have no idea what you mean here. I'm proposing a patch that stops
it being a game of chance, while preserving the existing
slightly-random behavior to the greatest extent possible. I think that
my patch would have stopped that problem altogether. Are you
suggesting that it wouldn't have?


I did many tests of your solution inside the 'quick vacuum' strategy [1] 
and the background worker called 'heap cleaner' [2]. I must admit that 
when I use your patch, there is no problem with dependencies.

This patch needs the opinion of another reviewer.

[1] 
https://www.postgresql.org/message-id/425db134-8bba-005c-b59d-56e50de3b41e%40postgrespro.ru
[2] 
https://www.postgresql.org/message-id/f49bb262-d246-829d-f835-3950ddac503c%40postgrespro.ru


--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company



Re: don't create storage when unnecessary

2018-12-17 Thread Michael Paquier
On Sun, Dec 16, 2018 at 05:47:16PM -0300, Alvaro Herrera wrote:
> I don't follow.  When a relfilenode is passed by callers, they indicate
> that the storage has already been created.  Contrariwise, when a
> relation kind that *does* have storage but is not yet created, they
> pass InvalidOid as relfilenode, and heap_create is in charge of creating
> storage.  Maybe I'm not quite seeing what problem you mean.  Or I could
> add a separate boolean, but that seems pointless.

I have been double-checking my thoughts on the matter, and one take is
to allow the reuse of relfilenodes for indexes like in TryReuseIndex().
I did not recall that case.  Sorry for the noise.

> Another possible improvement is to remove the create_storage boolean

Yes, this could go away as this is linked with relfilenode.  I leave it up
to you whether you want to remove it or keep it.  I think that I would just
remove it.

> Added a test in sanity_check.sql that there's no relation with the
> relkinds that aren't supposed to have storage.  Without the code fix it
> fails in current regression database, but in the failure result set
> there isn't any relation of kinds 'p' or 'I', so this isn't a terribly
> comprehensive test -- the query runs too early in the regression
> sequence.  I'm not sure it's worth bothering further.

+-- check that relations without storage don't have relfilenode
It could be an idea to add a comment mentioning that the set of relkinds
in RELKIND_CAN_HAVE_STORAGE are linked with the relkinds of this query.
--
Michael




Re: don't create storage when unnecessary

2018-12-17 Thread Kyotaro HORIGUCHI
Hello.

At Sun, 16 Dec 2018 17:47:16 -0300, Alvaro Herrera  
wrote in <20181216204716.jddzijk3fb2dpdk4@alvherre.pgsql>
> On 2018-Dec-07, Michael Paquier wrote:
> 
> > A macro makes sense to control that.
> 
> I added Ashutosh's RELKIND_HAS_STORAGE, but renamed it to
> RELKIND_CAN_HAVE_STORAGE, because some of the relkinds can be mapped and
> thus would have relfilenode set to 0.  I think this is a bit misleading
> either way.

FWIW, I think RELKIND_HAS_STORAGE looks better to me.

# Since it's a bit too late, I don't insist on that.

Mapped relations have storage, which is signalled by relfilenode
= 0 and the real file node is given by relation mapper. And it is
actually created at bootstrap time. In this sense,
(RELKIND_HAS_STORAGE && relfilenode == 0) in heap_create makes
sense.

Assertion (..->relNode != InvalidOid) at the end of
RelationInitPhysicalAddr doesn't fail. I think RELKIND_HAS_STORAGE
is preferable also in this sense.

ATExecSetTableSpaceNoStorage assumes that the relation
actually has storage.

> > Now I have to admit that I don't
> > like your solution.  Wouldn't it be cleaner to assign InvalidOid to
> > relfilenode in such cases?  The callers of heap_create would need to be
> > made smarter when they now pass down a relfilenode (looking at you,
> > DefineIndex!), but that seems way more consistent to me.
> 
> I don't follow.  When a relfilenode is passed by callers, they indicate
> that the storage has already been created.  Contrariwise, when a
> relation kind that *does* have storage but is not yet created, they
> pass InvalidOid as relfilenode, and heap_create is in charge of creating
> storage.  Maybe I'm not quite seeing what problem you mean.  Or I could
> add a separate boolean, but that seems pointless.
> 
> Another possible improvement is to remove the create_storage boolean 
> 
> > Some tests would also be welcome.
> 
> Added a test in sanity_check.sql that there's no relation with the
> relkinds that aren't supposed to have storage.  Without the code fix it
> fails in current regression database, but in the failure result set
> there isn't any relation of kinds 'p' or 'I', so this isn't a terribly
> comprehensive test -- the query runs too early in the regression
> sequence.  I'm not sure it's worth bothering further.

Actual files were not created even in the past; heap_create has
just been refactored, which looks fine. pg_class and relcache
entries no longer have a false relfilenode, which looks fine. The
test only needs to check that a relfilenode won't be given to such
relations, as it was in the past; that looks fine too.

The patch applies cleanly and passed the regression test.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist

2018-12-17 Thread Andrey Lepikhov




On 30.11.2018 15:10, Dmitry Dolgov wrote:

Thank you. I played a bit with this patch, and can confirm visible WAL size
reduction (it's rather obvious, but still). Although, probably due to lack of
my knowledge here, I wonder how it works when an index is built
concurrently?


The routine generate_xlog_for_rel() works only at phase 2 of concurrent
index building.
At phase 3, during validate_index() execution, we use the aminsert() ->
PlaceToPage() mechanism as before the patch.

In concurrent mode, I do not expect any problems caused by the patch.




Benchmarks:
---

Test: pgbench -f gin-WAL-test.sql -t 5:
---
master:
Latency average: 27696.299 ms
WAL size: 2.66 GB


Does it make sense to measure the latency of the entire script, since it also
contains some preparation work? Of course it still shows the difference between
the patched version and master, but probably in a noisier way.


Ok. It is used only for demonstration.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company



Re: New function pg_stat_statements_reset_query() to reset statistics of a specific query

2018-12-17 Thread Amit Kapila
On Wed, Dec 12, 2018 at 3:54 PM Haribabu Kommi  wrote:
>
> On Thu, Nov 29, 2018 at 1:57 PM Amit Kapila  wrote:
>> >
>>
>> Agreed.  Hari, can you change the patch as per the requirements of option-4.
>
>
> Apologies for delay. Thanks to all your opinions.
>
> Attached is the updated patch as per the option-4 and added a test also to 
> ensure
> the function strictness.
>

Can you add a few examples in the documentation?  See if it is
possible to extend the existing documentation section (F.29.4),
otherwise, add a new section.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's

2018-12-17 Thread Tom Lane
James Coleman  writes:
> Is there anything I can do to solicit reviewers for the current CF?

There is no active CF, which is why nothing is happening.  We'll
start looking at the 2019-01 items in January.  Right now, people
are either on vacation or working on their own patches.

regards, tom lane



Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's

2018-12-17 Thread James Coleman
Is there anything I can do to solicit reviewers for the current CF?

The patch is quite small and should be fairly simple to review.

- James



RE: [suggestion]support UNICODE host variables in ECPG

2018-12-17 Thread Tsunakawa, Takayuki
From: Nagaura, Ryohei [mailto:nagaura.ryo...@jp.fujitsu.com]
> There is a demand for ECPG to support UNICODE host variables.
> This topic has also appeared in thread [1].
> I would like to discuss whether to support in postgres.
> 
> Do you have any opinion?

* What's the benefit of supporting UTF16 in host variables?
* Does your proposal comply with the SQL standard?  If not, what does the SQL 
standard say about support for UTF16?
* Why only Windows?


Regards
Takayuki Tsunakawa





Re: 'infinity'::Interval should be added

2018-12-17 Thread Tom Lane
Isaac Morland  writes:
> On Mon, 17 Dec 2018 at 18:00, Tom Lane  wrote:
>> tl;dr: we should model it after the behavior of IEEE float infinities,
>> except we'll want to throw errors where those produce NaNs.

> Would it be OK to return NULL for ∞ - ∞?

IMO, no.  The only thing worse than inventing NaN for intervals would be
trying to use NULL as a substitute for it.  That'd amount to taking all
the semantic problems IEEE-arithmetic NaNs already have, and mixing
them irretrievably with SQL NULL's semantic problems (which are related
but different).

> Also am I right to assume that -infinity would use -INT_MAX, etc.? Or
> possibly -INT_MAX - 1?

The latter, which is why I mentioned INT_MIN.  There must not be any
finite value that would be less than -infinity, so if you don't use
INT_MIN for -infinity then you're just throwing away one useful
bitpattern.

(Conceivably, if we decided we did want NaN for intervals, we'd define
that as being INT_MIN and -infinity as being -INT_MAX.  But I really think
we don't want to go there, especially since we have not got NaN for
timestamps.  All these related datatypes need to be making similar
decisions, or we're just creating messy new edge cases.)

regards, tom lane



Re: Fixing typos in tests of partition_info.sql

2018-12-17 Thread Amit Langote
On 2018/12/18 10:57, Michael Paquier wrote:
> On Mon, Dec 17, 2018 at 07:49:42PM +0900, Michael Paquier wrote:
>> Okay, this suggestion sounds fine to me.  Thanks!
> 
> And committed with all your suggestions included.  Thanks for the
> discussion.

Thank you!

Regards,
Amit




Re: Fixing typos in tests of partition_info.sql

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 07:49:42PM +0900, Michael Paquier wrote:
> Okay, this suggestion sounds fine to me.  Thanks!

And committed with all your suggestions included.  Thanks for the
discussion.
--
Michael




Re: 'infinity'::Interval should be added

2018-12-17 Thread Isaac Morland
On Mon, 17 Dec 2018 at 18:00, Tom Lane  wrote:


> > so I was thinking that
> > postgres=# select 'infinity'::timestamp - 'infinity'::timestamp;
> > would be zero rather than an error, for least surprise.
>
> Wrong.  This case needs to be undefined, because "infinity"
> isn't a specific value.  That's what makes it okay to define, say,
> infinity plus any finite value as infinity.  There are very
> well-defined rules about how to calculate with infinity, and
> not following them is not the way to proceed here.
>
> tl;dr: we should model it after the behavior of IEEE float infinities,
> except we'll want to throw errors where those produce NaNs.
>

Would it be OK to return NULL for ∞ - ∞? Then anybody who wanted 0 could
get it with coalesce (although I think this is a worse idea than anybody
who wants it probably realizes), and anybody who wanted the calculation to
continue on would just get a NULL propagating.

Also am I right to assume that -infinity would use -INT_MAX, etc.? Or
possibly -INT_MAX - 1?


Re: Should new partitions inherit their tablespace from their parent?

2018-12-17 Thread Alvaro Herrera
On 2018-Dec-18, Michael Paquier wrote:

> On Tue, Dec 18, 2018 at 09:36:17AM +1300, David Rowley wrote:
> > On Tue, 18 Dec 2018 at 08:21, Alvaro Herrera  
> > wrote:
> >> Pushed with those changes.  Thanks for the patch!
> > 
> > Thanks for committing it and thanks for reviewing it, Michael.
> 
> Perhaps too early to speak, tests of sepgsql are broken after this
> commit as per rhinoceros report:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2018-12-18%2000%3A45%3A01

Ah, thanks.

I think we need to accept the new output.  It is saying that previously
we would not invoke the hook on SET TABLESPACE for a partitioned table,
which makes sense, and now we do invoke it, which also makes sense.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Should new partitions inherit their tablespace from their parent?

2018-12-17 Thread Michael Paquier
On Tue, Dec 18, 2018 at 09:36:17AM +1300, David Rowley wrote:
> On Tue, 18 Dec 2018 at 08:21, Alvaro Herrera  wrote:
>> Pushed with those changes.  Thanks for the patch!
> 
> Thanks for committing it and thanks for reviewing it, Michael.

Perhaps too early to speak, tests of sepgsql are broken after this
commit as per rhinoceros report:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2018-12-18%2000%3A45%3A01
--
Michael




Re: 'infinity'::Interval should be added

2018-12-17 Thread Andres Freund
Hi,

On 2018-12-15 11:34:10 -0800, Andres Freund wrote:
> [1] We should optimize the computation using pg_add_s{32,64}_overflow,
> the generated code after that change is *substantially* better (as
> well as the original code much shorter, and correct).  And consider,
> as mentioned further up, moving throwing the error out of line.

I don't have the focus to pursue this right now, so here's the changes
I'd made (for either somebody else to pick up, or as an archive for
myself).

Greetings,

Andres Freund
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index b377c38c8af..db4481bb5d1 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -23,6 +23,7 @@
 #include "access/hash.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/int128.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
@@ -3028,19 +3029,9 @@ interval_um(PG_FUNCTION_ARGS)
 
 	result = (Interval *) palloc(sizeof(Interval));
 
-	result->time = -interval->time;
-	/* overflow check copied from int4um */
-	if (interval->time != 0 && SAMESIGN(result->time, interval->time))
-		ereport(ERROR,
-(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
- errmsg("interval out of range")));
-	result->day = -interval->day;
-	if (interval->day != 0 && SAMESIGN(result->day, interval->day))
-		ereport(ERROR,
-(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
- errmsg("interval out of range")));
-	result->month = -interval->month;
-	if (interval->month != 0 && SAMESIGN(result->month, interval->month))
+	if (pg_sub_s32_overflow(0, interval->month, &result->month) ||
+		pg_sub_s32_overflow(0, interval->day, &result->day) ||
+		pg_sub_s64_overflow(0, interval->time, &result->time))
 		ereport(ERROR,
 (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
  errmsg("interval out of range")));
@@ -3088,23 +3079,13 @@ interval_pl(PG_FUNCTION_ARGS)
 	result = (Interval *) palloc(sizeof(Interval));
 
 	result->month = span1->month + span2->month;
-	/* overflow check copied from int4pl */
-	if (SAMESIGN(span1->month, span2->month) &&
-		!SAMESIGN(result->month, span1->month))
-		ereport(ERROR,
-(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
- errmsg("interval out of range")));
 
-	result->day = span1->day + span2->day;
-	if (SAMESIGN(span1->day, span2->day) &&
-		!SAMESIGN(result->day, span1->day))
-		ereport(ERROR,
-(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
- errmsg("interval out of range")));
+	StaticAssertExpr(sizeof(result->time) == sizeof(int64),
+	 "adjust me");
 
-	result->time = span1->time + span2->time;
-	if (SAMESIGN(span1->time, span2->time) &&
-		!SAMESIGN(result->time, span1->time))
+	if (pg_add_s32_overflow(span1->month, span2->month, &result->month) ||
+		pg_add_s32_overflow(span1->day, span2->day, &result->day) ||
+		pg_add_s64_overflow(span1->time, span2->time, &result->time))
 		ereport(ERROR,
 (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
  errmsg("interval out of range")));
@@ -3121,24 +3102,10 @@ interval_mi(PG_FUNCTION_ARGS)
 
 	result = (Interval *) palloc(sizeof(Interval));
 
-	result->month = span1->month - span2->month;
-	/* overflow check copied from int4mi */
-	if (!SAMESIGN(span1->month, span2->month) &&
-		!SAMESIGN(result->month, span1->month))
-		ereport(ERROR,
-(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
- errmsg("interval out of range")));
 
-	result->day = span1->day - span2->day;
-	if (!SAMESIGN(span1->day, span2->day) &&
-		!SAMESIGN(result->day, span1->day))
-		ereport(ERROR,
-(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
- errmsg("interval out of range")));
-
-	result->time = span1->time - span2->time;
-	if (!SAMESIGN(span1->time, span2->time) &&
-		!SAMESIGN(result->time, span1->time))
+	if (pg_sub_s32_overflow(span1->month, span2->month, &result->month) ||
+		pg_sub_s32_overflow(span1->day, span2->day, &result->day) ||
+		pg_sub_s64_overflow(span1->time, span2->time, &result->time))
 		ereport(ERROR,
 (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
  errmsg("interval out of range")));


Re: ALTER INDEX ... ALTER COLUMN not present in dump

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 02:24:08PM -0500, Tom Lane wrote:
> I eyeballed the patch, and it seems reasonable to me as well.  The test
> case could perhaps be made more extensive (say, a multicolumn index
> with a mix of columns with and without stats targets) but I'm not sure
> that it's worth the trouble.

Thanks all for the lookup and input!  I have added a more complex test
pattern with multiple columns, and committed it down to v11.
--
Michael




Re: ATTACH/DETACH PARTITION CONCURRENTLY

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 06:52:51PM -0300, Alvaro Herrera wrote:
> On 2018-Dec-17, Robert Haas wrote:
> This patch missing the CF deadline would not be a happy way for me to
> begin the new year.
> 
> I'm not sure what's the best way to move forward with this patch, but I
> encourage you to post whatever version you have before the deadline,
> even if you're not fully happy with it (and, heck, even if it doesn't
> compile and/or is full of FIXME or TODO comments).

Agreed.  This patch has value, and somebody else could always take it
from the point where you were.
--
Michael




Re: explain plans with information about (modified) gucs

2018-12-17 Thread Andres Freund
Hi,

On 2018-12-18 00:38:16 +0100, Tomas Vondra wrote:
> On 12/17/18 11:16 PM, Tom Lane wrote:
> > Tomas Vondra  writes:
> >> Yeah, I've been thinking about that too. Currently it gets filtered out
> >> because it's in the CLIENT_CONN_STATEMENT group, but the code only
> >> includes QUERY_TUNING_*.
> >> But we don't want to include everything from CLIENT_CONN_STATEMENT,
> >> because that would include various kinds of timeouts, vacuuming
> >> parameters, etc.
> >> And the same issue applies to work_mem, which is in RESOURCES_MEM. And
> >> of course, there's a lot of unrelated stuff in RESOURCES_MEM.
> >> So I guess we'll need to enable QUERY_TUNING_* and then selectively a
> >> couple of individual options from other groups.
> > 
> > This is putting way too much policy into the mechanism, if you ask me.
> > 
> 
> Doesn't that depend on how it'd be implemented? I have not envisioned to
> make these decisions in explain.c, but rather to keep them in guc.c
> somehow. Say in a function like this:
> 
> bool guc_affects_query_planning(config_generic *conf);
> 
> which would be a fairly simple check outlined before (QUERY_TUNING_*
> plus a couple of individual GUCs). Other use cases might provide similar
> filters.

If we were to do that, I'd suggest implementing a GUC_* flag specified
in the definition of the GUC, rather than a separate function containing
all the knowledge (but such a function could obviously still be used to
return such a fact).


> An alternative would be to somehow track this in the GUC definitions
> directly (similarly to how we track the group), but that seems rather
> inflexible and I'm not sure how would that handle ad-hoc use cases.

Not sure what problem you see here?

Greetings,

Andres Freund



Re: Copypasta in the PostgreSQL source

2018-12-17 Thread Andres Freund
Hi,

On 2018-12-18 00:36:55 +0100, David Fetter wrote:
> On Mon, Dec 17, 2018 at 05:31:22PM -0500, Tom Lane wrote:
> > David Fetter  writes:
> > > Please find attached a run of a tool that looks for duplicated
> > > tokens.  I've removed some things that seem like false positives,
> > > basically all from the stemmer part of the source, but there's
> > > still a lot.
> > 
> > I thought you were talking about problems like "that that" typos,
> > but on looking at the file, what this is actually complaining about
> > is any duplicated code segments anywhere.  I do not find this
> > helpful.  Refactoring to the point that dozen-line code stanzas
> > never appear more than once would be incredibly invasive, likely
> > very bad for performance, and I don't think it'd improve readability
> > either.
> 
> Is there a threshold, possibly much larger than a dozen lines, where
> such refactoring would actually make sense?

Sure, sometimes. But it seems unlikely that a report based on the
technology at hand here would be useful. If none of the code has changed
in recent times, it's usually not worth a lot of attention to
adapt, unless we're talking much larger pieces of code. And a fair bit
of what's pointed out in that report is well known (e.g. that ecpg
duplicates a lot of code in a horrible manner).  I think to be useful
it'd need to actually point out very similar code, that's starting to
diverge further - but that's obviously a much harder thing to achieve.


> > > - Would it make sense to make some kind of git commit trigger that
> > > at least warns when a new one has been introduced?
> > 
> > Commit triggers are NOT the place for heuristics about code quality,
> > even if they're well-considered heuristics.
> 
> Where's a better place for such heuristics?  I'd like to think that
> there are opportunities for more automation than we currently have, on
> that score.  Maybe a `make` target?

If we find useful targets, we can integrate them that way. I don't
consider this to be a worthy case though.

Greetings,

Andres Freund



Re: Copypasta in the PostgreSQL source

2018-12-17 Thread David Fetter
On Mon, Dec 17, 2018 at 05:31:22PM -0500, Tom Lane wrote:
> David Fetter  writes:
> > Please find attached a run of a tool that looks for duplicated
> > tokens.  I've removed some things that seem like false positives,
> > basically all from the stemmer part of the source, but there's
> > still a lot.
> 
> I thought you were talking about problems like "that that" typos,
> but on looking at the file, what this is actually complaining about
> is any duplicated code segments anywhere.  I do not find this
> helpful.  Refactoring to the point that dozen-line code stanzas
> never appear more than once would be incredibly invasive, likely
> very bad for performance, and I don't think it'd improve readability
> either.

Is there a threshold, possibly much larger than a dozen lines, where
such refactoring would actually make sense?

> > - Would it make sense to make some kind of git commit trigger that
> > at least warns when a new one has been introduced?
> 
> Commit triggers are NOT the place for heuristics about code quality,
> even if they're well-considered heuristics.

Where's a better place for such heuristics?  I'd like to think that
there are opportunities for more automation than we currently have, on
that score.  Maybe a `make` target?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: explain plans with information about (modified) gucs

2018-12-17 Thread Tomas Vondra



On 12/17/18 11:16 PM, Tom Lane wrote:
> Tomas Vondra  writes:
>> On 12/17/18 10:56 PM, legrand legrand wrote:
>>> what would you think about adding
>>> search_path
>>> to that list ?
> 
>> Yeah, I've been thinking about that too. Currently it gets filtered out
>> because it's in the CLIENT_CONN_STATEMENT group, but the code only
>> includes QUERY_TUNING_*.
>> But we don't want to include everything from CLIENT_CONN_STATEMENT,
>> because that would include various kinds of timeouts, vacuuming
>> parameters, etc.
>> And the same issue applies to work_mem, which is in RESOURCES_MEM. And
>> of course, there's a lot of unrelated stuff in RESOURCES_MEM.
>> So I guess we'll need to enable QUERY_TUNING_* and then selectively a
>> couple of individual options from other groups.
> 
> This is putting way too much policy into the mechanism, if you ask me.
> 

Doesn't that depend on how it'd be implemented? I have not envisioned to
make these decisions in explain.c, but rather to keep them in guc.c
somehow. Say in a function like this:

bool guc_affects_query_planning(config_generic *conf);

which would be a fairly simple check outlined before (QUERY_TUNING_*
plus a couple of individual GUCs). Other use cases might provide similar
filters.

An alternative would be to somehow track this in the GUC definitions
directly (similarly to how we track the group), but that seems rather
inflexible and I'm not sure how would that handle ad-hoc use cases.

> At least for the auto_explain use case, it'd make far more sense for
> the user to be able to specify which GUCs he wants the query space
> to be divided according to.  While it's possible to imagine giving
> auto_explain a control setting that's a list of GUC names, I'm not
> sure how we adapt the idea for other use-cases.  But the direction
> you're headed here will mostly ensure that nobody is happy.
> 

I certainly don't want to base this on explicitly listing "interesting"
GUCs anywhere. That would make it pretty useless for the use case I care
about, i.e. using auto_explain to investigate slow plans, when I don't
really know what GUC the application might have changed (certainly not
in advance).

I can't really say how to adopt this to other use cases, considering
there are none proposed (and I can't think of any either).


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: gist microvacuum doesn't appear to care about hot standby?

2018-12-17 Thread Alexander Korotkov
On Mon, Dec 17, 2018 at 3:35 PM Alexander Korotkov
 wrote:
> On Mon, Dec 17, 2018 at 3:40 AM Alexander Korotkov
>  wrote:
> > On Mon, Dec 17, 2018 at 1:25 AM Andres Freund  wrote:
> > > On 2018-12-17 01:03:52 +0300, Alexander Korotkov wrote:
> > > > Sorry for delay.  Attached patch implements conflict handling for gist
> > > > microvacuum like btree and hash.  I'm going to push it if no
> > > > objections.
> > > >
> > > > Note that it implements a new WAL record type.  So, new WAL can't be
> > > > replayed on an old minor release.  I'm not sure if we claim that it's
> > > > usually possible.  Should we state something explicitly for this case?
> > >
> > > Please hold off committing for a bit. Adding new WAL records in a minor
> > > release ought to be very well considered and a measure of last resort.
> > >
> > > Couldn't we determine the xid horizon on the primary, and reuse an
> > > existing WAL record to trigger the conflict?  Or something along those
> > > lines?
> >
> > I thought about that, but decided it's better to mimic B-tree and hash
> > behavior rather than invent new logic in a minor release.  But given
> > that new WAL record in minor release in substantial problem, that
> > argument doesn't matter.
> >
> > Yes, it seems to be possible.  We can determine xid horizon on primary
> > in the same way you proposed for B-tree and hash [1] and use
> > XLOG_HEAP2_CLEANUP_INFO record to trigger the conflict.  Do you like
> > me to make such patch for GiST based on your patch?
>
> Got another tricky idea.  Now, deleted offset numbers are written to
> buffer data.  We can also append them to the record data.  So, based on
> the record length we can resolve conflicts when offsets are provided in
> the record data.  An unpatched version will just ignore the extra record
> data tail.  That would cost us somewhat bigger WAL records, but solve
> other problems.  Any thoughts?

Please find attached a backpatch version of the patch implementing this
approach.  I find it more attractive than placing the xid horizon
calculation on the primary, because xid horizon calculation on the primary
is substantially new behavior, which is unwanted for backpatching.  I've
not yet tested this patch.

I'm going to test this patch, including WAL compatibility.  If
everything is OK, I will then commit it.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


gist-microvacuum-conflict-handling-backpatch.patch
Description: Binary data


Re: 'infinity'::Interval should be added

2018-12-17 Thread Tom Lane
Simon Riggs  writes:
> I would like to represent 'infinity' as interval->months = INT_MAX

Per the discussion upthread, what we need when creating an
infinite value is to set
month = INT32_MAX, day = INT32_MAX, time = INT64_MAX (for +Infinity)
month = INT32_MIN, day = INT32_MIN, time = INT64_MIN (for -Infinity)
else we'll need to put special-case logic into comparisons.

However, I think that we only need to restrict ordinary values from
not having month equal to INT32_MIN/MAX, that is the full range
of the other two fields can still be available for normal use.
So when *testing* for an infinity, you would only need to look at the
month field.

> Currently
> postgres=# select 'infinity'::timestamp = 'infinity'::timestamp;
>  ?column?
> --
>  t

Correct, we'd need that here too, if only because btree indexes and
sorting require the reflexive axiom to hold.

> so I was thinking that
> postgres=# select 'infinity'::timestamp - 'infinity'::timestamp;
> would be zero rather than an error, for least surprise.

Wrong.  This case needs to be undefined, because "infinity"
isn't a specific value.  That's what makes it okay to define, say,
infinity plus any finite value as infinity.  There are very
well-defined rules about how to calculate with infinity, and
not following them is not the way to proceed here.

tl;dr: we should model it after the behavior of IEEE float infinities,
except we'll want to throw errors where those produce NaNs.
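
For concreteness, a purely hypothetical sketch of what those rules would mean
for the proposed interval infinity (none of this is implemented today):

SELECT interval 'infinity' + interval '1 day';       -- infinity plus finite would be infinity
SELECT timestamp 'infinity' - timestamp 'now';       -- would yield interval 'infinity' instead of erroring
SELECT timestamp 'infinity' - timestamp 'infinity';  -- would still raise an error (the would-be NaN case)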

regards, tom lane



Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Kevin Grittner
On Mon, Dec 17, 2018 at 8:32 AM Corey Huinker  wrote:

> All suggestions are appreciated.

As I recall, the main issue is how to handle concurrency.  The
existing RI triggers do some very special handling of the multi-xact
ID to manage concurrent modifications.  Has anyone looked at the
issues with using those techniques with set-oriented statements for
the transition table approach?

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Kevin Grittner
On Mon, Dec 17, 2018 at 11:27 AM Alvaro Herrera
 wrote:
> On 2018-Dec-17, Pavel Stehule wrote:
>
>> A ROW trigger calls the RI check too often, and a statement trigger too
>> rarely. I think an ideal design would call the RI check every 10K rows. It
>> can be unfriendly if somebody does a very long import and it fails at the
>> end. I don't think there should be any performance difference if the RI
>> check is called per 1000 or more rows.
>
> This is a good point, but I'm not sure if it's possible to implement
> using statement-level triggers.  I think the way transition tables work
> is that you get the full results at the end of the command; there's no
> way to pass control to the RI stuff at arbitrary points during the
> execution of the command.
>
> Is there any guidance on the SQL standard about this?  I don't think the
> timing indicators in the standard (IMMEDIATE, DEFERRED) have any say on
> this.  Or do they?

Yes, they do.  *ALL* AFTER triggers fire after the statement
completes, it's a question of whether a particular trigger fires once
for the whole statement or once for each row.  Observe:

test=# CREATE TABLE t1 (t1id int PRIMARY KEY, t1desc text);
CREATE TABLE
test=# CREATE TABLE t2 (t2id int PRIMARY KEY, t1id int NOT NULL, t2desc text,
test(#   FOREIGN KEY (t1id) REFERENCES t1);
CREATE TABLE
test=# CREATE FUNCTION t2_insert_func()
test-#   RETURNS TRIGGER
test-#   LANGUAGE plpgsql
test-# AS $$
test$# BEGIN
test$#   RAISE NOTICE '%', new;
test$#   RETURN new;
test$# END;
test$# $$;
CREATE FUNCTION
test=# CREATE TRIGGER t2_insert_trig
test-#   BEFORE INSERT ON t2
test-#   FOR EACH ROW
test-#   EXECUTE FUNCTION t2_insert_func();
CREATE TRIGGER
test=# INSERT INTO t1 VALUES (1), (2), (3);
INSERT 0 3
test=# INSERT INTO t2 VALUES (10, 1), (20, 2), (30, 3), (40, 4), (50, 5);
NOTICE:  (10,1,)
NOTICE:  (20,2,)
NOTICE:  (30,3,)
NOTICE:  (40,4,)
NOTICE:  (50,5,)
ERROR:  insert or update on table "t2" violates foreign key constraint
"t2_t1id_fkey"
DETAIL:  Key (t1id)=(4) is not present in table "t1".

All inserts occur before the statement fails, per standard.

Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: Copypasta in the PostgreSQL source

2018-12-17 Thread Tom Lane
Alvaro Herrera  writes:
> On the other hand, I'm not clear on why do we need four copies of
> number_of_ones.

Yeah, it might be time to move something like that into a common
location.  Not sure where the threshold of pain is, though.

regards, tom lane



Re: 'infinity'::Interval should be added

2018-12-17 Thread Simon Riggs
On Mon, 17 Dec 2018 at 03:48, Tom Lane  wrote:


> The positive argument for adding infinity to interval is that
> we define operations such as timestamp minus timestamp as
> yielding interval.  That's why this has to fail right now:
>
> regression=# select timestamp 'infinity' - timestamp 'now';
> ERROR:  cannot subtract infinite timestamps
>

I would like to represent 'infinity' as interval->months = INT_MAX

The documented maximum for the Interval datatype is 178000000 years, which
is 2136000000 months,
but it is possible to have a higher value (up to 2147483647), since we
don't check inputs.
As a result, it is possible that someone is already storing values above
the stated limits, so this would change behavior for them. But if they were,
the net effect would be the same: it is still a very, very long
interval. It's not long enough to store useful time intervals for geology
or astrophysics, so I doubt it is used at all for that purpose.

Would there be objection to using the INT_MAX value? If so, what else can
be used?


> Of course, there are still cases like timestamp 'infinity' -
> timestamp 'infinity' that would need to fail, but that has a
> semantic basis rather than "the output type can't represent it".
> (No, I don't want to invent an interval equivalent of NaN
> to make that not fail.)
>

Currently

postgres=# select 'infinity'::timestamp = 'infinity'::timestamp;
 ?column?
--
 t

so I was thinking that

postgres=# select 'infinity'::timestamp - 'infinity'::timestamp;

would be zero rather than an error, for least surprise.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2018-12-17 Thread Tomas Vondra
Hi Alexey,

Thanks for the thorough and extremely valuable review!

On 12/17/18 5:23 PM, Alexey Kondratov wrote:
> Hi Tomas,
> 
>> This new version is mostly just a rebase to current master (or almost,
>> because 2pc decoding only applies to 29180e5d78 due to minor bitrot),
>> but it also addresses the new stuff committed since last version (most
>> importantly decoding of TRUNCATE). It also fixes a bug in WAL-logging of
>> subxact assignments, where the assignment was included in records with
>> XID=0, essentially failing to track the subxact properly.
> 
> I started reviewing your patch about a month ago and tried to do an
> in-depth review, since I am very interested in this patch too. The new
> version is not applicable to master 29180e5d78, but everything is OK
> after applying 2pc patch before. Anyway, I guess it may complicate
> further testing and review, since any potential reviewer has to take
> into account both patches at once. Previous version was applicable to
> master and was working fine for me separately (excepting a few
> patch-specific issues, which I try to explain below).
> 

I agree it's somewhat annoying, but I don't think there's a better way,
unfortunately. Decoding in-progress transactions does require safe
handling of concurrent aborts, so it has to be committed after the 2pc
decoding patch (which makes that possible). But the 2pc patch also
touches the same places as this patch series (it reworks the reorder
buffer for example).

> 
> Patch review
> 
> 
> First of all, I want to say thank you for such a huge work done. Here
> are some problems, which I have found and hopefully fixed with my
> additional patch (please, find attached, it should be applicable to the
> last commit of your newest patch version):
> 
> 1) The most important issue is that your tap tests were broken: the option
> "WITH (streaming=true)" was missing in the subscription creation
> statement. Therefore, the spilling mechanism has been tested rather than
> streaming.
> 

D'oh!

> 2) After fixing tests the first one with simple streaming is immediately
> failed, because of logical replication worker segmentation fault. It
> happens, since worker tries to call stream_cleanup_files inside
> stream_open_file at the stream start, while nxids is zero, then it goes
> to the negative value and everything crashes. Something similar may
> happen with xids array, so I added two checks there.
> 
> 3) The next problem is much more critical and is dedicated to historic
> MVCC visibility rules. Previously, walsender was starting to decode
> transaction on commit and we were able to resolve all xmin, xmax,
> combocids to cmin/cmax, build tuplecids hash and so on, but now we start
> doing all these things on the fly.
> 
> Thus, rather difficult situation arises: HeapTupleSatisfiesHistoricMVCC
> is trying to validate catalog tuples, which are currently in the future
> relatively to the current decoder position inside transaction, e.g. we
> may want to resolve cmin/cmax of a tuple, which was created with cid 3
> and deleted with cid 5, while we are currently at cid 4, so our
> tuplecids hash is not complete to handle such a case.
> 

Damn it! I ran into those two issues some time ago and I fixed it, but
I've forgotten to merge that fix into the patch. I'll merge those fixed
and compare them to your proposed fix, and send a new version tomorrow.

> 
> 4) There was a problem with marking top-level transaction as having
> catalog changes if one of its subtransactions has. It was causing a
> problem with DDL statements just after subtransaction start (savepoint),
> so data from new columns is not replicated.
> 
> 5) Similar issue with schema send. You send schema only once per each
> sub/transaction (IIRC), while we have to update schema on each catalog
> change: invalidation execution, snapshot rebuild, adding new tuple cids.
> So I ended up with adding is_schema_send flag to ReorderBufferTXN, since
> it is easy to set it inside RB and read in the output plugin. Probably,
> we have to choose a better place for this flag.
> 

Hmm. Can you share an example how to trigger these issues?

> 6) To better handle all these tricky cases I added new tap
> test—014_stream_tough_ddl.pl—which consist of really tough combination
> of DDL, DML, savepoints and ROLLBACK/RELEASE in a single transaction.
> 

Thanks!

> I marked all my fixes and every questionable place with comment and
> "TOCHECK:" label for easy search. Removing of pretty any of these fixes
> leads to the tests fail due to the segmentation fault or replication
> mismatch. Though I mostly read and tested old version of patch, but
> after a quick look it seems that all these fixes are applicable to the
> new version of patch as well.
> 

Thanks. I'll go through your patch tomorrow.

> 
> Performance
> 
> 
> I have also performed a series of performance tests, and found that
> patch adds a huge overhead in the case of a large transaction consisting
> of many small rows, 

Re: explain plans with information about (modified) gucs

2018-12-17 Thread Tom Lane
Tomas Vondra  writes:
> On 12/17/18 10:56 PM, legrand legrand wrote:
>> what would you think about adding
>> search_path
>> to that list ?

> Yeah, I've been thinking about that too. Currently it gets filtered out
> because it's in the CLIENT_CONN_STATEMENT group, but the code only
> includes QUERY_TUNING_*.
> But we don't want to include everything from CLIENT_CONN_STATEMENT,
> because that would include various kinds of timeouts, vacuuming
> parameters, etc.
> And the same issue applies to work_mem, which is in RESOURCES_MEM. And
> of course, there's a lot of unrelated stuff in RESOURCES_MEM.
> So I guess we'll need to enable QUERY_TUNING_* and then selectively a
> couple of individual options from other groups.

This is putting way too much policy into the mechanism, if you ask me.

At least for the auto_explain use case, it'd make far more sense for
the user to be able to specify which GUCs he wants the query space
to be divided according to.  While it's possible to imagine giving
auto_explain a control setting that's a list of GUC names, I'm not
sure how we adapt the idea for other use-cases.  But the direction
you're headed here will mostly ensure that nobody is happy.

regards, tom lane



Re: explain plans with information about (modified) gucs

2018-12-17 Thread Tomas Vondra
Hi,

On 12/17/18 10:56 PM, legrand legrand wrote:
> what would you think about adding
> search_path
> to that list ?
> 

Yeah, I've been thinking about that too. Currently it gets filtered out
because it's in the CLIENT_CONN_STATEMENT group, but the code only
includes QUERY_TUNING_*.

But we don't want to include everything from CLIENT_CONN_STATEMENT,
because that would include various kinds of timeouts, vacuuming
parameters, etc.

And the same issue applies to work_mem, which is in RESOURCES_MEM. And
of course, there's a lot of unrelated stuff in RESOURCES_MEM.

So I guess we'll need to enable QUERY_TUNING_* and then selectively a
couple of individual options from other groups.
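
For illustration only, that filtering is roughly what the following query over
pg_settings would pick out (just a sketch; the real check would live in guc.c,
and the exact set of extra GUCs pulled in from other groups is still open):

SELECT name, setting
FROM pg_settings
WHERE source <> 'default'
  AND (category LIKE 'Query Tuning%'
       OR name IN ('work_mem', 'search_path'));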

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: explain plans with information about (modified) gucs

2018-12-17 Thread legrand legrand
what would you think about adding
search_path
to that list ?

Regards
PAscal





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: ATTACH/DETACH PARTITION CONCURRENTLY

2018-12-17 Thread Alvaro Herrera
On 2018-Dec-17, Robert Haas wrote:

> I have done a bit more work on this, but need to spend some more time
> on it before I have something that is worth posting.  Not sure whether
> I'll get to that before the New Year at this point.

This patch missing the CF deadline would not be a happy way for me to
begin the new year.

I'm not sure what's the best way to move forward with this patch, but I
encourage you to post whatever version you have before the deadline,
even if you're not fully happy with it (and, heck, even if it doesn't
compile and/or is full of FIXME or TODO comments).

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Should new partitions inherit their tablespace from their parent?

2018-12-17 Thread David Rowley
On Tue, 18 Dec 2018 at 08:21, Alvaro Herrera  wrote:
> Pushed with those changes.  Thanks for the patch!

Thanks for committing it and thanks for reviewing it, Michael.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: gist microvacuum doesn't appear to care about hot standby?

2018-12-17 Thread Peter Geoghegan
On Sun, Dec 16, 2018 at 4:41 PM Alexander Korotkov
 wrote:
> I thought about that, but decided it's better to mimic B-tree and hash
> behavior rather than invent new logic in a minor release.  But given
> that new WAL record in minor release in substantial problem, that
> argument doesn't matter.

It might be better to mimic B-Tree and hash on the master branch.

-- 
Peter Geoghegan



Re: don't create storage when unnecessary

2018-12-17 Thread Alvaro Herrera
Rebased over today's conflicting commits.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 4d5b82aaa9..b128959cb7 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -324,16 +324,12 @@ heap_create(const char *relname,
 		get_namespace_name(relnamespace), relname),
  errdetail("System catalog modifications are currently disallowed.")));
 
-	/*
-	 * Decide if we need storage or not, and handle a couple other special
-	 * cases for particular relkinds.
-	 */
+	/* Handle reltablespace for specific relkinds. */
 	switch (relkind)
 	{
 		case RELKIND_VIEW:
 		case RELKIND_COMPOSITE_TYPE:
 		case RELKIND_FOREIGN_TABLE:
-			create_storage = false;
 
 			/*
 			 * Force reltablespace to zero if the relation has no physical
@@ -342,17 +338,7 @@ heap_create(const char *relname,
 			reltablespace = InvalidOid;
 			break;
 
-		case RELKIND_PARTITIONED_TABLE:
-		case RELKIND_PARTITIONED_INDEX:
-			/*
-			 * For partitioned tables and indexes, preserve tablespace so that
-			 * it's used as the tablespace for future partitions.
-			 */
-			create_storage = false;
-			break;
-
 		case RELKIND_SEQUENCE:
-			create_storage = true;
 
 			/*
 			 * Force reltablespace to zero for sequences, since we don't
@@ -361,19 +347,21 @@ heap_create(const char *relname,
 			reltablespace = InvalidOid;
 			break;
 		default:
-			create_storage = true;
 			break;
 	}
 
 	/*
-	 * Unless otherwise requested, the physical ID (relfilenode) is initially
-	 * the same as the logical ID (OID).  When the caller did specify a
-	 * relfilenode, it already exists; do not attempt to create it.
+	 * Decide whether to create storage. If caller passed a valid relfilenode,
+	 * storage is already created, so don't do it here.  Also don't create it
+	 * for relkinds without physical storage.
 	 */
-	if (OidIsValid(relfilenode))
+	if (!RELKIND_CAN_HAVE_STORAGE(relkind) || OidIsValid(relfilenode))
 		create_storage = false;
 	else
+	{
+		create_storage = true;
 		relfilenode = relid;
+	}
 
 	/*
 	 * Never allow a pg_class entry to explicitly specify the database's
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index c3071db1cd..a4c8a9e385 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1253,6 +1253,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
 static void
 RelationInitPhysicalAddr(Relation relation)
 {
+	/* these relation kinds never have storage */
+	if (!RELKIND_CAN_HAVE_STORAGE(relation->rd_rel->relkind))
+		return;
+
 	if (relation->rd_rel->reltablespace)
 		relation->rd_node.spcNode = relation->rd_rel->reltablespace;
 	else
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 2217081dcc..393edd33d2 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -450,13 +450,12 @@ typedef struct ViewOptions
 
 /*
  * RelationIsMapped
- *		True if the relation uses the relfilenode map.
- *
- * NB: this is only meaningful for relkinds that have storage, else it
- * will misleadingly say "true".
+ *		True if the relation uses the relfilenode map.  Note multiple eval
+ *		of argument!
  */
 #define RelationIsMapped(relation) \
-	((relation)->rd_rel->relfilenode == InvalidOid)
+	(RELKIND_CAN_HAVE_STORAGE((relation)->rd_rel->relkind) && \
+	 ((relation)->rd_rel->relfilenode == InvalidOid))
 
 /*
  * RelationOpenSmgr
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index 009a89fc1a..89537bcfcf 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -224,3 +224,12 @@ SELECT relname, nspname
 -+-
 (0 rows)
 
+-- check that relations without storage don't have relfilenode
+SELECT relname, relkind
+  FROM pg_class
+ WHERE relkind IN ('v', 'c', 'f', 'p', 'I')
+   AND relfilenode <> 0;
+ relname | relkind 
+-+-
+(0 rows)
+
diff --git a/src/test/regress/sql/sanity_check.sql b/src/test/regress/sql/sanity_check.sql
index a2feebc91b..a4ec00309f 100644
--- a/src/test/regress/sql/sanity_check.sql
+++ b/src/test/regress/sql/sanity_check.sql
@@ -31,3 +31,9 @@ SELECT relname, nspname
  AND NOT EXISTS (SELECT 1 FROM pg_index i WHERE indrelid = c.oid
  AND indkey[0] = a.attnum AND indnatts = 1
  AND indisunique AND indimmediate);
+
+-- check that relations without storage don't have relfilenode
+SELECT relname, relkind
+  FROM pg_class
+ WHERE relkind IN ('v', 'c', 'f', 'p', 'I')
+   AND relfilenode <> 0;


Re: ALTER INDEX ... ALTER COLUMN not present in dump

2018-12-17 Thread Tom Lane
amul sul  writes:
>> On Mon, Dec 17, 2018 at 11:54 AM Michael Paquier 
>>> So this settles the argument that we had better not do anything before
>>> v11.  Switching the dump code to use column numbers has not proved to be
>>> complicated as only the query and some comments had to be tweaked.
>>> Attached is an updated patch, and I am switching back the patch to
>>> "Needs review" to have an extra pair of eyes look at that in case I
>>> missed something.

> I've been through the patch -- looks good and does the expected job as
> discussed.

I eyeballed the patch, and it seems reasonable to me as well.  The test
case could perhaps be made more extensive (say, a multicolumn index
with a mix of columns with and without stats targets) but I'm not sure
that it's worth the trouble.

regards, tom lane



Re: Should new partitions inherit their tablespace from their parent?

2018-12-17 Thread Alvaro Herrera
On 2018-Dec-18, David Rowley wrote:

> 1. Shouldn't you be using the RangeVarGetRelid() macro instead of
> calling RangeVarGetRelidExtended()?

This should have been obvious but I didn't notice.

> 2. In MergeAttributes(), the parentOids list still exists and is
> populated.  This is now only used to determine if the "supers" list
> contains any duplicate Oids. Maybe it's better to rename that list to
> something like "seenOids" to avoid any confusion with the "supers"
> list. Or maybe it's worth thinking of a better way to detect duplicate
> items in the "supers" list.

Good catch.

What I did was move the duplicate detection to the loop doing
RangeVarGetRelid, and remove parentOids from MergeAttributes.

Pushed with those changes.  Thanks for the patch!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Pavel Stehule
On Mon, Dec 17, 2018 at 19:19, Adam Brusselback <adambrusselb...@gmail.com> wrote:

> It's something I know I am interested in. For me, I don't really care if
> my statement doesn't cancel until the very end if there is an RI violation.
> The benefit of not having deletes be slow on tables that other tables
> reference via a fkey that has no index of its own is huge IMO. I have a
> good number of those types of logging tables where an index is not useful
> 99% of the time, but every once in a while a bulk delete needs to happen.
>
> It is far from a premature optimization IMO, it is super useful and
> something I was hoping would happen ever since I heard about transition
> tables being worked on.
>

note: my sentence about premature optimization was related to my idea of
doing the RI check per 10K rows.

It would be great if the RI check were faster.


> Just my $0.02.
> -Adam
>


Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Adam Brusselback
It's something I know I am interested in. For me, I don't really care if my
statement doesn't cancel until the very end if there is an RI violation. The
benefit of not having deletes be slow on tables that other tables reference
via a fkey that has no index of its own is huge IMO. I have a good number of
those types of logging tables where an index is not useful 99% of the time,
but every once in a while a bulk delete needs to happen.
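
To make the shape of that concrete, here is a rough sketch (table and column
names are made up):

    create table parent (id bigint primary key);
    create table event_log (
        parent_id bigint references parent (id),   -- intentionally no index
        payload   text
    );

    -- With row-level RI triggers, a bulk delete such as
    --   delete from parent where id < 100000;
    -- fires one check query against event_log per deleted row, and each of
    -- those checks has to scan event_log because parent_id is unindexed.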

It is far from a premature optimization IMO, it is super useful and
something I was hoping would happen ever since I heard about transition
tables being worked on.

Just my $0.02.
-Adam


psql exit status with multiple -c or -f

2018-12-17 Thread Justin Pryzby
Our deployment script failed to notice that dozens of commands had failed in a
transaction block, and I only noticed due to keeping full logs and monitoring
for error_severity>'LOG'.  I would have thought that the exit status would be
nonzero had an error occurred in an earlier script.

The docs since PG9.6 say:
https://www.postgresql.org/docs/9.6/app-psql.html
|Exit Status
|
|psql returns 0 to the shell if it finished normally, 1 if a fatal error of its
|own occurs (e.g. out of memory, file not found), 2 if the connection to the
|server went bad and the session was not interactive, and 3 if an error occurred
|in a script and the variable ON_ERROR_STOP was set.

d5563d7df94488bf0ab52ac0678e8a07e5b8297e
psql: Support multiple -c and -f options, and allow mixing them.

If there's an error, it returns 1 (although that's not "a fatal error of its
own").

|[pryzbyj@database ~]$ psql ts -c foo 2>/dev/null ; echo $?
|1

But the error is "lost" if another script or -c runs without failing:

|[pryzbyj@database ~]$ psql ts -txqc foo -c SELECT 2>/dev/null ; echo $?
|0

Note, this works as expected:

|[pryzbyj@database ~]$ psql ts -v ON_ERROR_STOP=1 -txqc foo -f /dev/null 
2>/dev/null ; echo $?
|1

Justin



Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Pavel Stehule
On Mon, Dec 17, 2018 at 18:27, Alvaro Herrera wrote:

> On 2018-Dec-17, Pavel Stehule wrote:
>
> > A ROW trigger calls the RI check too often, and a statement trigger too
> > rarely. I think the ideal design would be to call the RI check every 10K
> > rows. It can be unfriendly if somebody does a very long import and it
> > fails at the end. I don't think there should be any performance difference
> > if the RI check is called per 1000 or more rows.
>
> This is a good point, but I'm not sure if it's possible to implement
> using statement-level triggers.  I think the way transition tables work
> is that you get the full results at the end of the command; there's no
> way to pass control to the RI stuff at arbitrary points during the
> execution of the command.
>

It was just an idea. When I think about it, I am not sure whether an RI check
from a statement trigger isn't worse when the statement's changes are too big.
On the other hand, maybe that is premature optimization. We can check
performance on a prototype.



> Is there any guidance on the SQL standard about this?  I don't think the
> timing indicators in the standard (IMMEDIATE, DEFERRED) have any say on
> this.  Or do they?
>
> Maybe there is a solution for this.  I think it's worth thinking about,
> even if it's just to say that we won't do it.
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Alvaro Herrera
On 2018-Dec-17, Pavel Stehule wrote:

> A ROW trigger calls the RI check too often, and a statement trigger too
> rarely. I think the ideal design would be to call the RI check every 10K
> rows. It can be unfriendly if somebody does a very long import and it fails
> at the end. I don't think there should be any performance difference if the
> RI check is called per 1000 or more rows.

This is a good point, but I'm not sure if it's possible to implement
using statement-level triggers.  I think the way transition tables work
is that you get the full results at the end of the command; there's no
way to pass control to the RI stuff at arbitrary points during the
execution of the command.

Is there any guidance on the SQL standard about this?  I don't think the
timing indicators in the standard (IMMEDIATE, DEFERRED) have any say on
this.  Or do they?

Maybe there is a solution for this.  I think it's worth thinking about,
even if it's just to say that we won't do it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Minimal logical decoding on standbys

2018-12-17 Thread Petr Jelinek
Hi,

On 12/12/2018 21:41, Andres Freund wrote:
> 
> I don't like the approach of managing the catalog horizon via those
> periodically logged catalog xmin announcements.  I think we instead
> should build ontop of the records we already have and use to compute
> snapshot conflicts.  As of HEAD we don't know whether such tables are
> catalog tables, but that's just a bool that we need to include in the
> records, a basically immeasurable overhead given the size of those
> records.

IIRC I was originally advocating adding that xmin announcement to the
standby snapshot message, but this seems better.

> 
> If we were to go with this approach, there'd be at least the following
> tasks:
> - adapt tests from [2]
> - enforce hot-standby to be enabled on the standby when logical slots
>   are created, and at startup if a logical slot exists
> - fix issue around btree_xlog_delete_get_latestRemovedXid etc mentioned
>   above.
> - Have a nicer conflict handling than what I implemented here.  Craig's
>   approach deleted the slots, but I'm not sure I like that.  Blocking
>   seems more appropriate here; after all it's likely that the
>   replication topology would be broken afterwards.
> - get_rel_logical_catalog() shouldn't be in lsyscache.[ch], and can be
>   optimized (e.g. check wal_level before opening rel etc).
> 
> 
> Once we have this logic, it can be used to implement something like
> failover slots on-top, by having having a mechanism that occasionally
> forwards slots on standbys using pg_replication_slot_advance().
> 

Looking at this from the failover slots perspective. Wouldn't blocking
on conflict mean that we stop physical replication on catalog xmin
advance when there is lagging logical replication on the primary? It might
not be too big a deal, as in that use case it should only happen if
hs_feedback was off at some point, but just wanted to point out this
potential problem.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: Why aren't we using strsignal(3) ?

2018-12-17 Thread Alvaro Herrera
On 2018-Dec-17, Tom Lane wrote:

> But it looks like
> we could drop the sys_siglist support for an undetectably small penalty:
> even if, somewhere, there's a platform that has sys_siglist[] but not
> strsignal(), it'd just mean that you get only a signal number and have
> to look up its meaning.
> 
> While a dozen lines in pgstrsignal.c certainly are not worth worrying
> over, getting rid of the configure test for sys_siglist[] would save
> some cycles on every build.  So I'm tempted to drop it.  Thoughts?

+1 for nuking it.  configure times grow larger, and there's seldom a
change to make them shorter.  In this case, per your analysis, it
doesn't look like we're losing anything worthwhile.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Why aren't we using strsignal(3) ?

2018-12-17 Thread Abhijit Menon-Sen
At 2018-12-17 11:52:03 -0500, t...@sss.pgh.pa.us wrote:
>
> While a dozen lines in pgstrsignal.c certainly are not worth worrying
> over, getting rid of the configure test for sys_siglist[] would save
> some cycles on every build.  So I'm tempted to drop it.  Thoughts?

Removing it makes sense to me.

-- Abhijit



Re: Why aren't we using strsignal(3) ?

2018-12-17 Thread Tom Lane
Alvaro Herrera  writes:
> On 2018-Dec-16, Tom Lane wrote:
>> I propose to replace all these places with code like
>> 
>>  snprintf(str, sizeof(str),
>>   _("child process was terminated by signal %d: %s"),
>>   WTERMSIG(exitstatus), pg_strsignal(WTERMSIG(exitstatus)));
>> 
>> where pg_strsignal is a trivial wrapper around strsignal() if that
>> exists, else it uses sys_siglist[] if that exists, else it just
>> returns "unrecognized signal".

> LGTM.

Done at a73d08319.  Early returns from the buildfarm show that we have
some systems that have strsignal() but not sys_siglist[], e.g.
damselfly and Noah's AIX herd.  So this patch does provide a useful
increment of portability.  What I'm finding interesting is that there
seem to be no systems that have sys_siglist[] but not strsignal().

Digging around on the net suggests that sys_siglist[] is a BSD-ism
but the BSDs all adopted strsignal() a very long time ago.  For instance,
OpenBSD's man page for it says "The strsignal() function first appeared
in AT&T System V Release 4 UNIX and was reimplemented for NetBSD 1.0."
OpenBSD has it at least back to OpenBSD 2.2 (oldest man pages that
they have online), while FreeBSD adopted it at FreeBSD 4.0.

There are systems that have neither API (just the old HPUX critters)
so we can't dispense with the wrapper entirely.  But it looks like
we could drop the sys_siglist support for an undetectably small penalty:
even if, somewhere, there's a platform that has sys_siglist[] but not
strsignal(), it'd just mean that you get only a signal number and have
to look up its meaning.

While a dozen lines in pgstrsignal.c certainly are not worth worrying
over, getting rid of the configure test for sys_siglist[] would save
some cycles on every build.  So I'm tempted to drop it.  Thoughts?

regards, tom lane



Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Pavel Stehule
On Mon, Dec 17, 2018 at 15:32, Corey Huinker wrote:

> Back when Pg added statement-level triggers, I was interested in the
> potential promise of moving referential integrity checks to statement-level
> triggers.
>
> The initial conversation included Kevin Grittner's POC script (in SQL),
> which showed a potential 98% reduction in time spent doing RI checks.
> The original thread is here:
>
>
> https://www.postgresql.org/message-id/CACjxUsM4s9%3DCUmPU4YFOYiD5f%3D2ULVDBjuFSo20Twe7KbUe8Mw%40mail.gmail.com
>
> I dug around in the code, and was rather surprised at how close we already
> are to implementing this. The function RI_Initial_Check() already does a
> left-join query via SPI to look for any invalid data, so if we could just
> replace the near table with the transition table for inserted rows, we'd be
> home free. The function SPI_register_trigger_data() makes the transition
> tables visible to SPI, so I started to wonder why this hadn't been done
> already.
>
> I approached Kevin and Thomas Munro seeking feedback on my approach. I
> also made it into a session at the PgConf.ASIA un-conference, and then
> later with Michael Paquier at that same conference, and the coalesced
> feedback was this:
>
> - the overhead of registering the transition tables probably makes it
> unprofitable for single row inserts
> - the single row overhead is itself significant, so maybe the transition
> tables aren't worse
> - there has been talk of replacing transition tables with an in-memory
> data structure that would be closer to "free" from a startup perspective
> and might even coalesce the transition tables of multiple statements in the
> same transaction
> - because no declarative code changes, it's trivial to switch from row
> level to statement level triggering via pg_upgrade
> - assuming that transition tables are an overhead that only pays off when
> > N rows have been updated, does it make sense to enforce RI with something
> that isn't actually a trigger?
> - there was also some mention that parallel query uses a queue mechanism
> that might be leveraged to do row-level triggers for updates of <= N rows
> and statement level for > N
>
> That's what I have so far. I'm going to be working on a POC patch so that
> I can benchmark a pure-statement-level solution, which if nothing else will
> let us know the approximate value of N.
>
> All suggestions are appreciated.
>

It is great. I thought about it a little bit - just theoretically.

A ROW trigger calls the RI check too often, and a statement trigger too
rarely. I think the ideal design would be to call the RI check every 10K rows.
It can be unfriendly if somebody does a very long import and it fails at the
end. I don't think there should be any performance difference if the RI check
is called per 1000 or more rows.

Regards

Pavel


Re: [HACKERS] WIP: Aggregation push-down

2018-12-17 Thread Antonin Houska
Tom Lane  wrote:

> Antonin Houska  writes:
> > [ agg_pushdown_v8.tgz ]
> 
> I spent a few hours looking at this today.

Thanks!

> It seems to me that at no point has there been a clear explanation of what
> the patch is trying to accomplish, so let me see whether I've got it
> straight or not.  (I suggest that something like this ought to be included
> in optimizer/README; the patch's lack of internal documentation is a serious
> deficiency.)

Earlier version of the patch did update optimizer/README but I forgot to merge
it into the current one. I'll fix that in the next version. (Also some more
will be added, especially header comments of some functions.)

> Conceptually, we'd like to be able to push aggregation down below joins
> when that yields a win, which it could do by reducing the number of rows
> that have to be processed at the join stage.  Thus consider
> 
> SELECT agg(a.x) FROM a, b WHERE a.y = b.z;
> 
> We can't simply aggregate during the scan of "a", because some values of
> "x" might not appear at all in the input of the naively-computed aggregate
> (if their rows have no join partner in "b"), or might appear multiple
> times (if their rows have multiple join partners).  So at first glance,
> aggregating before the join is impossible.  The key insight of the patch
> is that we can make some progress by considering only grouped aggregation:

Yes, the patch does not handle queries with no GROUP BY clause. Primarily
because I don't know how the grouped "a" table in this example could emit the
"a.y" column w/o using it as a grouping key.

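Just to illustrate the grouped case that is handled, here is a rough SQL
sketch of the idea, using sum() and hypothetical tables a(x, y) and b(z) --
the algebraic equivalence only, not the planner mechanism:

    -- original query
    SELECT b.z, sum(a.x) FROM a JOIN b ON a.y = b.z GROUP BY b.z;

    -- aggregate pushed below the join; valid for sum() because repeating a
    -- partial result once per join partner is exactly what the plain join
    -- would have produced
    SELECT b.z, sum(p.partial_sum)
      FROM (SELECT y, sum(x) AS partial_sum FROM a GROUP BY y) AS p
      JOIN b ON p.y = b.z
     GROUP BY b.z;
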
> if we can group "a" into sets of rows that must all have the same join
> partners, then we can do preliminary aggregation within each such group,
> and take care of the number-of-repetitions problem when we join.  If the
> groups are large then this reduces the number of rows processed by the
> join, at the cost that we might spend time computing the aggregate for
> some row sets that will just be discarded by the join.  So now we consider
> 
> SELECT agg(a.x) FROM a, b WHERE a.y = b.z GROUP BY ?;
> 
> and ask what properties the grouping column(s) "?" must have to make this
> work.  I believe we can say that it's OK to push down through a join if
> its join clauses are all of the form "a.y = b.z", where either a.y or b.z
> is listed as a GROUP BY column *and* the join operator is equality for
> the btree opfamily specified by the SortGroupClause.  (Note: actually,
> SortGroupClause per se mentions an equality operator, although I think the
> planner quickly reduces that to an opfamily specification.)

I suppose you mean make_pathkey_from_sortop().

> The concerns Robert had about equality semantics are certainly vital, but I
> believe that the logic goes through correctly as long as the grouping
> equality operator and the join equality operator belong to the same
> opfamily.

ok, generic approach like this is always better. I just could not devise it,
nor can I prove that your theory is wrong.

> Robert postulated cases like "citext = text", but that doesn't apply
> here because no cross-type citext = text operator could be part of a
> well-behaved opfamily.

The reason I can see now is that the GROUP BY operator (opfamily) essentially
has the grouping column on both sides, so it does not match the cross-type
operator.

> What we'd actually be looking at is either "citextvar::text texteq textvar"
> or "citextvar citexteq textvar::citext", and the casts prevent these
> expressions from matching GROUP BY entries that have the wrong semantics.

Can you please explain this? If the expression is "citextvar::text texteq
textvar", then both operands of the operator should be of "text" type (because
the operator receives the output of the cast), so I'd expect a match to
SortGroupClause.eqop of clause like "GROUP BY ".

> However: what we have proven here is only that we can aggregate across
> a set of rows that must share the same join partners.  We still have
> to be able to handle the case that the rowset has more than one join
> partner, which AFAICS means that we must use partial aggregation and
> then apply an "aggmultifn" (or else apply the aggcombinefn N times).
> We can avoid that and use plain aggregation when we can prove the "b"
> side of the join is unique, so that no sets of rows will have to be merged
> post-join; but ISTM that that reduces the set of use cases to be too small
> to be worth such a complex patch.  So I'm really doubtful that we should
> proceed forward with only that case available.

I tried to follow Heikki's proposal in [1] and separated the functionality
that does not need the partial aggregation. I was not able to foresee that the
patch does not get much smaller this way. Also because I had to introduce
functions that check whether particular join does duplicate aggregated values
or not. (Even if we use partial aggregation, we can use this functionality to
detect that we only need to call aggfinalfn() in some cases, as opposed 

Re: Pluggable Storage - Andres's take

2018-12-17 Thread Dmitry Dolgov
> On Sat, Dec 15, 2018 at 8:37 PM Andres Freund  wrote:
>
> We need to dump the table access method at dump time, otherwise we lose
> that information.

Oh, right. So, something like in the attached patch?

> > As a side note, in a table description I haven't found any mention of which
> > access method is used for this table, probably it's useful to show that 
> > with \d+
> > (see the attached patch).
>
> I'm not convinced that's really worth the cost of including it in \d
> (rather than \d+ or such).

Maybe I'm missing the point, but I meant exactly the same thing, and the
patch suggested in the previous email adds this info to \d+


pg_dump_access_method.patch
Description: Binary data


Re: slight tweaks to documentation about runtime pruning

2018-12-17 Thread Amit Langote
On Mon, Dec 17, 2018 at 11:49 PM Alvaro Herrera
 wrote:
> On 2018-Dec-14, Amit Langote wrote:
>
> > I updated the patch.  Regarding whether we should mention "(never
> > executed)", it wouldn't hurt to mention it imho, exactly because it's
> > shown in the place of showing loops=0.  How about the attached?
>
> Pushed, thanks.

Thank you!

Regards,
Amit



Statement-level Triggers For Uniqueness Checks

2018-12-17 Thread Corey Huinker
In digging around the codebase (see thread: Referential Integrity Checks
with Statement-level Triggers), I noticed that unique constraints are
similarly enforced with a per-row trigger.

The situation with unique indexes is fairly similar to the situation with
RI checks: there is some overhead to using a transition table, but that
overhead may be less than the cost of firing a trigger once per row
inserted/updated.

However, there are some significant differences (apologies to everyone
already familiar with this part of the code, it's new to me).

For one, there is no analog to RI_Initial_Check(). Instead the constraint
is initially checked via building/finding the unique index that would
enforce the uniqueness check.

Then, the actual lookup done in unique_key_recheck has to contend with the
intricacies of HOT updates, so I don't know if that can be expressed in an
SPI query. Even if not, I think it should be possible to iterate over
the EphemeralNamedRelation, and that iteration would itself have a payoff in
reduced trigger calls.
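
For the easy half -- duplicates introduced within the same statement -- a
set-based check over the transition table looks straightforward (a sketch
only, with hypothetical names):

    -- new_rows stands in for the transition table of inserted/updated rows
    SELECT key_col
      FROM new_rows
     GROUP BY key_col
    HAVING count(*) > 1
     LIMIT 1;   -- any row back means the statement itself created a duplicate

The harder half is rechecking those keys against rows already in the table,
which is where the HOT-update intricacies mentioned above come in.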

I'm going to be working on this as a POC patch separate from the RI work,
hence the separate thread, but there's obviously a lot of overlap.

All advice is appreciated.


Re: slight tweaks to documentation about runtime pruning

2018-12-17 Thread Alvaro Herrera
On 2018-Dec-14, Amit Langote wrote:

> I updated the patch.  Regarding whether we should mention "(never
> executed)", it wouldn't hurt to mention it imho, exactly because it's
> shown in the place of showing loops=0.  How about the attached?

Pushed, thanks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [External] Re: simple query on why a merge join plan got selected

2018-12-17 Thread Vijaykumar Jain
Thanks a lot Tom, as always :)
We generally do not have so many duplicates in production, so maybe this is
an edge case, but I am happy with the explanation and the code reference for
the analysis.
I'll also play with the default statistics target to see what changes by
increasing the value.
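
For reference, the kind of thing I plan to try (the values are only
illustrative):

    alter table t2 alter column a set statistics 1000;
    analyze t2;
    -- or raise the default before re-analyzing:
    set default_statistics_target = 500;
    analyze;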


On Sun, 16 Dec 2018 at 5:52 AM Tom Lane  wrote:

> Vijaykumar Jain  writes:
> > I was just playing with exploring joins and plans i came across this
> > create table t1(a int);
> > create table t2(a int);
> > insert into t1 select (x % 10) from generate_series(1, 100000) x;
> > insert into t2 select (x % 100) from generate_series(1, 100000) x;
> > ...
> > select * from t1 join t2 using (a);
>
> Hm.  This is a fairly extreme case for mergejoining.  In the first place,
> because of the disparity in the key ranges (t1.a goes from 0..9, t2.a
> from 0..99) the planner can figure out that a merge join can stop after
> scanning only 10% of t2.  That doesn't help much here, since we still
> have to sort all of t2, but nonetheless the planner is going to take
> that into account.  In the second place, because you have so many
> duplicate values, most rows in t1 will require "rescanning" 1000 rows
> that were already read and joined to the previous row of t1 (assuming
> t1 is on the left of the join; it's worse if t2 is on the left).
>
> The planner estimates each of those situations properly, but it looks
> to me like it is not handling the combination of both effects correctly.
> In costsize.c we've got
>
> /*
>  * The number of tuple comparisons needed is approximately number of outer
>  * rows plus number of inner rows plus number of rescanned tuples (can we
>  * refine this?).  At each one, we need to evaluate the mergejoin quals.
>  */
> startup_cost += merge_qual_cost.startup;
> startup_cost += merge_qual_cost.per_tuple *
>     (outer_skip_rows + inner_skip_rows * rescanratio);
> run_cost += merge_qual_cost.per_tuple *
>     ((outer_rows - outer_skip_rows) +
>      (inner_rows - inner_skip_rows) * rescanratio);
>
> where outer_rows and inner_rows are the numbers of rows we're predicting
> to actually read from each input, the xxx_skip_rows values are zero for
> this example, and rescanratio was previously computed as
>
> /* We'll inflate various costs this much to account for rescanning */
> rescanratio = 1.0 + (rescannedtuples / inner_path_rows);
>
> where inner_path_rows is the *total* size of the inner relation,
> including rows that we're predicting won't get read because of the
> stop-short effect.
>
> As far as I can tell, that comment's claim about the number of tuple
> comparisons needed is on-target ... but the code is computing a number
> of tuple comparisons 10x less than that.  The reason is that rescanratio
> is wrong: it should be
>
> rescanratio = 1.0 + (rescannedtuples / inner_rows);
>
> instead, so that it's something that makes sense to multiply inner_rows
> by.  In the existing uses of rescanratio, one multiplies it by
> inner_path_rows and needs to be changed to inner_rows to agree with
> this definition, but the other uses are already consistent with this.
>
> This doesn't make a significant difference if either rescannedtuples
> is small, or inner_rows isn't much less than inner_path_rows.  But
> when neither is true, we can greatly underestimate the number of tuple
> comparisons we'll have to do, as well as the number of re-fetches from
> the inner plan node.  I think in practice it doesn't matter that often,
> because in such situations we'd usually not have picked a mergejoin
> anyway.  But in your example the buggy mergejoin cost estimate is about
> 10% less than the hashjoin cost estimate, so we go with mergejoin.
>
> The attached proposed patch fixes this, raising the mergejoin cost
> estimate to about 35% more than the hashjoin estimate, which seems
> a lot closer to reality.  It doesn't seem to change any results in
> the regression tests, which I find unsurprising: there are cases
> like this in the tests, but as I just said, they pick hashjoins
> already.
>
> Also interesting is that after this fix, the estimated costs of a
> mergejoin for this example are about the same whether t1 or t2 is on
> the left.  I think that's right: t2-on-the-left has 10x more rescanning
> to do per outer tuple, but it stops after scanning only 10% of the
> outer relation, canceling that out.
>
> I'm not sure whether to back-patch this.  It's a pretty clear thinko,
> but there's the question of whether we'd risk destabilizing plan
> choices that are working OK in the real world.
>
> regards, tom lane
>
> --

Regards,
Vijay


Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Corey Huinker
Back when Pg added statement-level triggers, I was interested in the
potential promise of moving referential integrity checks to statement-level
triggers.

The initial conversation included Kevin Grittner's POC script (in SQL),
which showed a potential 98% reduction in time spent doing RI checks.
The original thread is here:

https://www.postgresql.org/message-id/CACjxUsM4s9%3DCUmPU4YFOYiD5f%3D2ULVDBjuFSo20Twe7KbUe8Mw%40mail.gmail.com

I dug around in the code, and was rather surprised at how close we already
are to implementing this. The function RI_Initial_Check() already does a
left-join query via SPI to look for any invalid data, so if we could just
replace the near table with the transition table for inserted rows, we'd be
home free. The function SPI_register_trigger_data() makes the transition
tables visible to SPI, so I started to wonder why this hadn't been done
already.
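
To sketch the idea (hypothetical table and column names; the real thing would
build the query in C the same way RI_Initial_Check() builds its query):

    -- new_rows stands in for the transition table of inserted/updated rows
    SELECT n.fk_col
      FROM new_rows n
      LEFT JOIN pk_table p ON p.pk_col = n.fk_col
     WHERE n.fk_col IS NOT NULL
       AND p.pk_col IS NULL
     LIMIT 1;   -- any row returned means a foreign key violation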

I approached Kevin and Thomas Munro seeking feedback on my approach. I also
made it into a session at the PgConf.ASIA un-conference, and then later
with Michael Paquier at that same conference, and the coalesced feedback
was this:

- the overhead of registering the transition tables probably makes it
unprofitable for single row inserts
- the single row overhead is itself significant, so maybe the transition
tables aren't worse
- there has been talk of replacing transition tables with an in-memory data
structure that would be closer to "free" from a startup perspective and
might even coalesce the transition tables of multiple statements in the
same transaction
- because no declarative code changes, it's trivial to switch from row
level to statement level triggering via pg_upgrade
- assuming that transition tables are an overhead that only pays off when >
N rows have been updated, does it make sense to enforce RI with something
that isn't actually a trigger?
- there was also some mention that parallel query uses a queue mechanism
that might be leveraged to do row-level triggers for updates of <= N rows
and statement level for > N

That's what I have so far. I'm going to be working on a POC patch so that I
can benchmark a pure-statement-level solution, which if nothing else will
let us know the approximate value of N.

All suggestions are appreciated.


Re: Should new partitions inherit their tablespace from their parent?

2018-12-17 Thread David Rowley
On Mon, 17 Dec 2018 at 11:07, Alvaro Herrera  wrote:
> First, I changed the Assert() to use the macro for relations with
> storage that I just posted in the other thread that Michael mentioned.
>
> I then noticed that we're doing a heap_openrv() in the parent relation
> and closing it before MergeAttribute() does the same thing all over
> again; also MergeAttribute has the silly array-of-OIDs return value for
> the parents so that DefineRelation can handle it again later.  Rube
> Goldberg says hi.  I changed this so that *before* doing anything with
> the parent list, we transform it to a list of OIDs, and lock them; so
> MergeAttributes now receives the list of OIDs of parents rather than
> RangeVars.  We can also move the important comment (about lock level of
> parent rels) buried in the bowels of MergeAttribute to the place where
> it belongs in DefineRelation; and we no longer have to mess with
> transforming names to OIDs multiple times.
>
> Proposed patch attached.

I like this idea. Seems much better than resolving the relation name twice.

I've read over the patch and the only two things that I noted were:

1. Shouldn't you be using the RangeVarGetRelid() macro instead of
calling RangeVarGetRelidExtended()?
2. In MergeAttributes(), the parentOids list still exists and is
populated.  This is now only used to determine if the "supers" list
contains any duplicate Oids. Maybe it's better to rename that list to
something like "seenOids" to avoid any confusion with the "supers"
list. Or maybe it's worth thinking of a better way to detect duplicate
items in the "supers" list.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: ATTACH/DETACH PARTITION CONCURRENTLY

2018-12-17 Thread Robert Haas
On Sun, Dec 16, 2018 at 6:43 AM Michael Paquier  wrote:
> On Sat, Dec 15, 2018 at 01:04:00PM +0300, Sergei Kornilov wrote:
> > Seems we erroneously moved this thread to next CF:
> > https://commitfest.postgresql.org/21/1842/
> > Can you close this entry?
>
> Robert has committed a patch to refactor a bit the list construction of
> RelationBuildPartitionDesc thanks to 7ee5f88e, but the main patch has
> not been committed, so the current status looks right to me.

I have done a bit more work on this, but need to spend some more time
on it before I have something that is worth posting.  Not sure whether
I'll get to that before the New Year at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove double trailing semicolons

2018-12-17 Thread David Rowley
On Mon, 17 Dec 2018 at 22:10, Amit Kapila  wrote:
> Pushed, one of these goes back till 10.

Thanks.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Problems with plan estimates in postgres_fdw

2018-12-17 Thread Etsuro Fujita

(2018/12/17 22:09), Etsuro Fujita wrote:

Here is a set of WIP patches for pushing down ORDER BY LIMIT to the remote:

* 0001-postgres-fdw-upperrel-ordered-WIP.patch:


Correction: 
0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-WIP.patch



* 0002-postgres-fdw-upperrel-final-WIP.patch:


Correction: 0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-WIP.patch

Best regards,
Etsuro Fujita




Re: Problems with plan estimates in postgres_fdw

2018-12-17 Thread Etsuro Fujita

(2018/10/09 14:48), Etsuro Fujita wrote:

(2018/10/05 19:15), Etsuro Fujita wrote:

(2018/08/02 23:41), Tom Lane wrote:

Andrew Gierth writes:

[ postgres_fdw is not smart about exploiting fast-start plans ]


Yeah, that's basically not accounted for at all in the current design.


One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?


[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining. The trick would be to know what value to put as the
limit, though. It'd be easy to do if we were willing to explain the
query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start
plans,
it might make sense to build two scan paths for a foreign table, one
based
on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.


Agreed, but ISTM that to address the original issue, it would be enough
to just add LIMIT (or ORDER BY LIMIT) pushdown to postgres_fdw based on
the upper-planner-pathification work.


Will work on it unless somebody else wants to.


Here is a set of WIP patches for pushing down ORDER BY LIMIT to the remote:

* 0001-postgres-fdw-upperrel-ordered-WIP.patch:
This patch performs the UPPERREL_ORDERED step remotely.

* 0002-postgres-fdw-upperrel-final-WIP.patch:
This patch performs the UPPERREL_FINAL step remotely.  Currently, this 
only supports SELECT commands, and pushes down LIMIT/OFFSET to the 
remote if possible.  This also removes LockRows if there is a FOR 
UPDATE/SHARE clause, which would be safe because postgres_fdw performs 
early locking.  I'd like to leave INSERT, UPDATE and DELETE cases for 
future work.  It is my long-term todo to rewrite PlanDirectModify using 
the upper-planner-pathification work.  :)


For some regression test cases with ORDER BY and/or LIMIT, I noticed 
that these patches still cannot push down those clauses to the remote.  I 
guess the cost/size estimation added by these patches would need some 
tweaking, but I didn't look into that in detail yet.  Maybe I'm missing 
something, though.
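
To clarify the intent: with these patches, for a foreign table ft
(hypothetical) a local query like

    SELECT * FROM ft ORDER BY a LIMIT 10;

is expected to ship both the ordering and the limit to the remote server,
roughly as

    SELECT a, b FROM public.t ORDER BY a ASC NULLS LAST LIMIT 10;

instead of fetching the whole remote table and sorting/limiting locally.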


Comments welcome!

Best regards,
Etsuro Fujita
>From 41b8253a732982aa2183fca5c1a9b23377f21b4d Mon Sep 17 00:00:00 2001
From: Etsuro Fujita 
Date: Mon, 17 Dec 2018 21:16:47 +0900
Subject: [PATCH 1/2] postgres_fdw: Perform UPPERREL_ORDERED step remotely

---
 contrib/postgres_fdw/deparse.c |  28 ++-
 contrib/postgres_fdw/expected/postgres_fdw.out | 152 +---
 contrib/postgres_fdw/postgres_fdw.c| 315 +++--
 contrib/postgres_fdw/postgres_fdw.h|   9 +-
 src/backend/optimizer/plan/planner.c   |   7 +-
 src/include/nodes/relation.h   |  10 +
 6 files changed, 403 insertions(+), 118 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 654323f..499c932 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -168,7 +168,8 @@ static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 static void deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs,
  deparse_expr_cxt *context);
 static void deparseLockingClause(deparse_expr_cxt *context);
-static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context);
+static void appendOrderByClause(List *pathkeys, bool has_final_sort,
+	deparse_expr_cxt *context);
 static void appendConditions(List *exprs, deparse_expr_cxt *context);
 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
 	  RelOptInfo *foreignrel, bool use_alias,
@@ -930,8 +931,8 @@ build_tlist_to_deparse(RelOptInfo *foreignrel)
 void
 deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel,
 		List *tlist, List *remote_conds, List *pathkeys,
-		bool is_subquery, List **retrieved_attrs,
-		List **params_list)
+		bool has_final_sort, bool is_subquery,
+		List **retrieved_attrs, List **params_list)
 {
 	deparse_expr_cxt context;
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private;
@@ -986,7 +987,7 @@ deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel,
 
 	/* Add ORDER BY clause if we found any useful pathkeys */
 	if (pathkeys)
-		appendOrderByClause(pathkeys, &context);
+		appendOrderByClause(pathkeys, has_final_sort, &context);
 
 	/* Add any necessary FOR UPDATE/SHARE. */
 	deparseLockingClause(&context);
@@ -1591,7 +1592,7 @@ 

Re: Connection slots reserved for replication

2018-12-17 Thread Alexander Kukushkin
Hi,

On Thu, 6 Dec 2018 at 00:55, Petr Jelinek  wrote:
> > Does excluding WAL senders from the max_connections limit and including 
> > max_wal_senders in MaxBackends also imply that we need to add 
> > max_wal_senders to the list at xlog.c: CheckRequiredParameterValues, 
> > requiring its value on the replica to be not lower than the one on the 
> > primary?
> >
>
> I think it does, we need the proc slots for walsenders on the standby
> same way we do for normal backends.

You are absolutely right. Attaching the new version of the patch.

Regards,
--
Alexander Kukushkin
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index d8fd195da0..36ea513273 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -2172,6 +2172,11 @@ LOG:  database system is ready to accept read only connections
  max_locks_per_transaction
 

+   
+
+ max_wal_senders
+
+   

 
  max_worker_processes
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 8d9d40664b..18665895cb 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -717,13 +717,13 @@ psql: could not connect to server: No such file or directory

 SEMMNI
 Maximum number of semaphore identifiers (i.e., sets)
-at least ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16) plus room for other applications
+at least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) plus room for other applications

 

 SEMMNS
 Maximum number of semaphores system-wide
-ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16) * 17 plus room for other applications
+ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17 plus room for other applications

 

@@ -780,13 +780,13 @@ psql: could not connect to server: No such file or directory
 other applications. The maximum number of semaphores in the system
 is set by SEMMNS, which consequently must be at least
 as high as max_connections plus
-autovacuum_max_workers plus max_worker_processes,
-plus one extra for each 16
+autovacuum_max_workers plus max_wal_senders,
+plus max_worker_processes, plus one extra for each 16
 allowed connections plus workers (see the formula in ).  The parameter SEMMNI
 determines the limit on the number of semaphore sets that can
 exist on the system at one time.  Hence this parameter must be at
-least ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16).
+least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16).
 Lowering the number
 of allowed connections is a temporary workaround for failures,
 which are usually confusingly worded No space
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 00741c7b09..1a304f19b8 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -109,11 +109,12 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 			}
 		}
 
-		appendStringInfo(buf, "max_connections=%d max_worker_processes=%d "
-		 "max_prepared_xacts=%d max_locks_per_xact=%d "
-		 "wal_level=%s wal_log_hints=%s "
-		 "track_commit_timestamp=%s",
+		appendStringInfo(buf, "max_connections=%d max_wal_senders=%d "
+		 "max_worker_processes=%d max_prepared_xacts=%d "
+		 "max_locks_per_xact=%d wal_level=%s "
+		 "wal_log_hints=%s track_commit_timestamp=%s",
 		 xlrec.MaxConnections,
+		 xlrec.max_wal_senders,
 		 xlrec.max_worker_processes,
 		 xlrec.max_prepared_xacts,
 		 xlrec.max_locks_per_xact,
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c80b14ed97..668c7ffcf2 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5257,6 +5257,7 @@ BootStrapXLOG(void)
 
 	/* Set important parameter values for use when replaying WAL */
 	ControlFile->MaxConnections = MaxConnections;
+	ControlFile->max_wal_senders = max_wal_senders;
 	ControlFile->max_worker_processes = max_worker_processes;
 	ControlFile->max_prepared_xacts = max_prepared_xacts;
 	ControlFile->max_locks_per_xact = max_locks_per_xact;
@@ -6161,6 +6162,9 @@ CheckRequiredParameterValues(void)
 		RecoveryRequiresIntParameter("max_connections",
 	 MaxConnections,
 	 ControlFile->MaxConnections);
+		RecoveryRequiresIntParameter("max_wal_senders",
+	 max_wal_senders,
+	 ControlFile->max_wal_senders);
 		RecoveryRequiresIntParameter("max_worker_processes",
 	 max_worker_processes,
 	 ControlFile->max_worker_processes);
@@ -9453,6 +9457,7 @@ 

Re: gist microvacuum doesn't appear to care about hot standby?

2018-12-17 Thread Alexander Korotkov
On Mon, Dec 17, 2018 at 3:40 AM Alexander Korotkov
 wrote:
> On Mon, Dec 17, 2018 at 1:25 AM Andres Freund  wrote:
> > On 2018-12-17 01:03:52 +0300, Alexander Korotkov wrote:
> > > Sorry for delay.  Attached patch implements conflict handling for gist
> > > microvacuum like btree and hash.  I'm going to push it if no
> > > objections.
> > >
> > > Note, that it implements a new WAL record type.  So, new WAL can't be
> > > replayed on an old minor release.  I'm not sure if we claim that it's
> > > usually possible.  Should we state something explicitly for this case?
> >
> > Please hold off committing for a bit. Adding new WAL records in a minor
> > release ought to be very well considered and a measure of last resort.
> >
> > Couldn't we determine the xid horizon on the primary, and reuse an
> > existing WAL record to trigger the conflict?  Or something along those
> > lines?
>
> I thought about that, but decided it's better to mimic B-tree and hash
> behavior rather than invent new logic in a minor release.  But given
> that a new WAL record in a minor release is a substantial problem, that
> argument doesn't matter.
>
> Yes, it seems to be possible.  We can determine xid horizon on primary
> in the same way you proposed for B-tree and hash [1] and use
> XLOG_HEAP2_CLEANUP_INFO record to trigger the conflict.  Do you like
> me to make such patch for GiST based on your patch?

I've got another tricky idea.  Currently, deleted offset numbers are written
to the buffer data.  We could also append them to the record data.  Then,
based on the record length, we can resolve conflicts when offsets are provided
in the record data.  An unpatched version would just ignore the extra record
data tail.  That would cost us somewhat bigger WAL records, but would solve
the other problems.  Any thoughts?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: New function pg_stat_statements_reset_query() to reset statistics of a specific query

2018-12-17 Thread Kyotaro HORIGUCHI
Hello.  Adding one more vote here.

At Sat, 15 Dec 2018 08:03:46 +0530, Amit Kapila  wrote 
in 
> On Sat, Dec 15, 2018 at 12:14 AM Robert Haas  wrote:
> >
> > On Wed, Nov 28, 2018 at 1:43 PM Alvaro Herrera  
> > wrote:
> > > Right, I think option 4 is a clear improvement over option 1.  I can get
> > > behind that one.  Since not many people care to vote, I think this tips
> > > the scales enough to that side.
> >
> > I'm showing up very late to the party here,
> >
> 
> Not a problem, people like you are always welcome.
> 
> > but I like option 1 best.
> >
> 
> You are not alone in this camp, so, IIUC, below are voting by different people
> 
> Option-1: Vik, Sergei, Robert
> Option-2: Alvaro, Magnus
> Option-3: Michael, Hari
> Option-4: Amit, Hari, Magnus, Alvaro, Michael, Peter
> 
> Some people's names are present in two options as they are okay with
> either one of those.  I see that now more people are in favor of
> option-1, but still, the tilt is more towards option-4.
> 
> > I feel like the SQL standard has a pretty clear idea that NULL is how
> > you represent a value is unknown, which shows up in a lot of places.
> > Deciding that we're going to use a different sentinel value in this
> > one case because NULL is a confusing concept in general seems pretty
> > strange to me.
> >
> 
> I agree that NULL seems to be the attractive choice for a default
> value, but we have to also see what it means?  In this case, NULL will
> mean 'all' which doesn't go with generally what NULL means (aka
> unknown, only NULL can be compared with other NULL).  There are a few
> people on this thread who feel that having NULL can lead to misuse of
> this API [1] as explained here and probably we need to use some
> workaround for it to be used in production [2].
> [1] - 
> https://www.postgresql.org/message-id/CAA4eK1LicqWY55XxmahQXti4RjQ28iuASAk1X8%2ByKX0J051_VQ%40mail.gmail.com
> [2] - 
> https://www.postgresql.org/message-id/20181117111653.cetidngkgol5e5xn%40alvherre.pgsql


Although #1 seems cleaner to me, the fear of inadvertently removing all
queries through a quite-common usage wins.  So I vote for #4 for now,
taking pg_stat_statements_reset(0, 0, 0) to mean clear-all.
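
That is, with the option-4 style of signature (a sketch only; 0 acting as the
"match anything" value for each argument, and the oids/queryid below being
made up):

    SELECT pg_stat_statements_reset(0, 0, 0);         -- discard every entry
    SELECT pg_stat_statements_reset(10, 0, 0);        -- only entries of user 10
    SELECT pg_stat_statements_reset(0, 16384, 0);     -- only one database
    SELECT pg_stat_statements_reset(0, 0, 1984822);   -- only one queryid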

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Fixing typos in tests of partition_info.sql

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 06:35:03PM +0900, Amit Langote wrote:
> As far as the information content of this comment is concerned, I think
> it'd be more useful to word this comment such that it is applicable to
> different functions than to word it such that it is applicable to
> different queries.  More opinions would be nice.

This argument is sensible.

> Okay, how about:
> 
> -- Various partitioning-related functions return NULL if passed relations
> -- of types that cannot be part of a partition tree; for example, views,
> -- materialized views, etc.

Okay, this suggestion sounds fine to me.  Thanks!
--
Michael


signature.asc
Description: PGP signature


Re: pg_dump multi VALUES INSERT

2018-12-17 Thread Surafel Temesgen
On Fri, Nov 30, 2018 at 7:16 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:

>
> Unfortunately, patch needs to be rebased, could you please post an updated
> version?
>

Thank you for letting me know.  Here is an updated patch against current master.
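
For illustration, with something like --inserts=3 the data section of the dump
is intended to come out roughly as follows (hypothetical table), instead of
one INSERT per row:

    INSERT INTO public.t (a, b) VALUES
        (1, 'one'),
        (2, 'two'),
        (3, 'three');
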
Regards
Surafel
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2015410a42..d93d8fc0c2 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -766,7 +766,10 @@ PostgreSQL documentation
 non-PostgreSQL databases.
 However, since this option generates a separate command for each row,
 an error in reloading a row causes only that row to be lost rather
-than the entire table contents.
+than the entire table contents. The number of rows per insert statement
+can also be specified to make the dump file smaller and faster to reload,
+but this loses the property that only a single row is lost on a reload error;
+instead, all rows in the affected insert statement are lost.
 Note that
 the restore might fail altogether if you have rearranged column order.
 The --column-inserts option is safe against column
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 637c79af48..6b1440e173 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -358,7 +358,7 @@ main(int argc, char **argv)
		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
		{"exclude-table-data", required_argument, NULL, 4},
		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", optional_argument, NULL, 8},
		{"lock-wait-timeout", required_argument, NULL, 2},
		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +557,20 @@ main(int argc, char **argv)
 dosync = false;
 break;
 
+			case 8:			/* inserts values number */
+if (optarg)
+{
+	dopt.dump_inserts_multiple = atoi(optarg);
+	if (dopt.dump_inserts_multiple < 0)
+	{
+		write_msg(NULL, "insert values must be positive number\n");
+		exit_nicely(1);
+	}
+}
+else
+	dopt.dump_inserts = 1;
+break;
+
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit_nicely(1);
@@ -607,7 +621,8 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
 		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
 
 	/* Identify archive format to emit */
@@ -877,6 +892,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -2052,6 +2068,193 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * Dump table data using multiple values INSERT commands.
+ */
+static int
+dumpTableData_insert_multiple(Archive *fout, void *dcontext)
+{
+	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	DumpOptions *dopt = fout->dopt;
+	PQExpBuffer q = createPQExpBuffer();
+	PQExpBuffer i = createPQExpBuffer();
+	PQExpBuffer insertStmt = NULL;
+	PGresult   *res;
+	int			tuple;
+	int			nfields;
+	int			field;
+	int			ntuple;
+	int			ltuple;
+
+	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
+	  "SELECT * FROM ONLY %s",
+	  fmtQualifiedDumpable(tbinfo));
+	if (tdinfo->filtercond)
+		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+
+	ExecuteSqlStatement(fout, q->data);
+	appendPQExpBuffer(i, "FETCH %d FROM _pg_dump_cursor",
+	  dopt->dump_inserts_multiple);
+	while (1)
+	{
+		res = ExecuteSqlQuery(fout, i->data, PGRES_TUPLES_OK);
+		nfields = PQnfields(res);
+		ntuple = PQntuples(res);
+		ltuple = ntuple-1;
+			if (ntuple > 0)
+			{
+				if (insertStmt == NULL)
+				{
+			
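
To illustrate the intended output format (the table name and rows below are invented for this example, not taken from the patch), a dump produced with --inserts=3 would group up to three rows into each statement, roughly like:

    INSERT INTO public.emp VALUES
        ('Mike', '1 sydney, NSW'),
        ('Jane', '2 melbourne, VIC'),
        ('Ken', '3 brisbane, QLD');

A single bad row then invalidates that whole three-row statement on reload, which is the trade-off the documentation hunk above describes.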

RE: [suggestion]support UNICODE host variables in ECPG

2018-12-17 Thread Matsumura, Ryo
Nagaura-san

I understand that the previous discussion pointed out that the feature had
better be implemented more simply or step-by-step, and that more description
of the implementation is needed.
I also think this prevented the discussion from reaching the details of the
feature.

What is your opinion about it?

Regards
Ryo Matsumura




Re: Fixing typos in tests of partition_info.sql

2018-12-17 Thread Amit Langote
On 2018/12/17 18:10, Michael Paquier wrote:
> On Mon, Dec 17, 2018 at 05:56:08PM +0900, Amit Langote wrote:
>> You're saying that we should use plural "functions" because there are 2
>> *instances* of calling the function pg_partition_tree in the queries that
>> follow the comment, but I think that would be misleading.  I think the
>> plural would make sense if we're talking about two different functions,
>> but I may be wrong.
> 
> Or this could just use "Function calls"?

As far as the information content of this comment is concerned, I think
it'd be more useful to word this comment such that it is applicable to
different functions than to word it such that it is applicable to
different queries.  More opinions would be nice.

> My argument is just to not
> forget about updating this comment later on and minimize future noise
> diffs.

Okay, how about:

-- Various partitioning-related functions return NULL if passed relations
-- of types that cannot be part of a partition tree; for example, views,
-- materialized views, etc.

Thanks,
Amit




Re: ALTER INDEX ... ALTER COLUMN not present in dump

2018-12-17 Thread amul sul
On Mon, Dec 17, 2018 at 12:20 PM amul sul  wrote:

> On Mon, Dec 17, 2018 at 11:54 AM Michael Paquier 
> wrote:
> >
> > On Mon, Dec 17, 2018 at 12:24:15AM -0500, Tom Lane wrote:
> > > If we were to rename the "foo.expr" column at this point,
> > > and then dump and reload, the expression column in the
> > > second index would presumably acquire the name "expr"
> > > not "expr1", because "expr" would no longer be taken.
> > > So if pg_dump were to try to use that index column name
> > > in ALTER ... SET STATISTICS, it'd fail.
> >
> > Good point, thanks!  I did not think about the case where a table uses
> > an attribute name matching what would be generated for indexes.
> >
> > So this settles the argument that we had better not do anything before
> > v11.  Switching the dump code to use column numbers has not proved to be
> > complicated as only the query and some comments had to be tweaked.
> > Attached is an updated patch, and I am switching back the patch to
> > "Needs review" to have an extra pair of eyes look at that in case I
> > missed something.
>
> +1, will have a look, thanks.
>
>
I've been through the patch -- it looks good and does the expected job as
discussed.

make check and make check-world also pass fine.

Regards,
Amul
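
For readers following along, the scenario described in the quoted mail can be sketched like this (object names invented here); because the generated name of an index expression column can change across a dump and reload, pg_dump now refers to such columns by number:

    CREATE TABLE foo (a int, expr int);
    CREATE INDEX foo_idx ON foo (expr, (a + 1));
    -- per the reasoning above, the expression column is presumably named
    -- "expr1" here because "expr" is already taken; after renaming foo.expr
    -- and reloading a dump it would become just "expr", so the dump
    -- addresses it by position instead:
    ALTER INDEX foo_idx ALTER COLUMN 2 SET STATISTICS 1000;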


RE: [PROPOSAL]a new data type 'bytea' for ECPG

2018-12-17 Thread Matsumura, Ryo
Meskes-san

I noticed that I was confused.
My topic was about adding bytea as a new host variable type.

The topic *didn't* include receiving binary-format data
into an SQLDATA descriptor, as in the following.

  sqlda_t *sqlda;
  exec sql create table if not exists test (c1 bytea);
  exec sql select c1 into descriptor sqlda from test;
  /* It expects that sqlda->sqlvar[0].sqldata is binary format. */


So, please ignore the following in my mail at 2018-11-12 02:14:58.
> P.S.
> The patch does not support ECPG.bytea in sqltype of "struct sqlvar_struct"
> because of compatibility.

The topic did include receiving binary data into a named SQL descriptor
using a bytea host variable, as in the following. This has already been
implemented in my patch.

  exec sql begin declare section;
  bytea var[128];
  exec sql end declare section;
  exec sql create table if not exists test (c1 bytea);
  exec sql allocate descriptor odesc;
  exec sql select c1 into sql descriptor odesc from test;
  exec sql get descriptor odesc value 1 :var = data;

Regards
Ryo Matsumura


Re: Fixing typos in tests of partition_info.sql

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 05:56:08PM +0900, Amit Langote wrote:
> You're saying that we should use plural "functions" because there are 2
> *instances* of calling the function pg_partition_tree in the queries that
> follow the comment, but I think that would be misleading.  I think the
> plural would make sense if we're talking about two different functions,
> but I may be wrong.

Or this could just use "Function calls"?  My argument is just to not
forget about updating this comment later on and minimize future noise
diffs.
--
Michael




Re: Remove double trailing semicolons

2018-12-17 Thread Amit Kapila
On Mon, Dec 17, 2018 at 1:00 PM Amit Kapila  wrote:
>
> On Mon, Dec 17, 2018 at 1:53 AM David Rowley
>  wrote:
> >
> > While looking over some recent commits, I noticed there are some code
> > lines with a double trailing semicolon instead of a single one.
>

Pushed, one of these goes back to 10.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Fixing typos in tests of partition_info.sql

2018-12-17 Thread Amit Langote
On 2018/12/17 17:25, Michael Paquier wrote:
> On Mon, Dec 17, 2018 at 04:41:01PM +0900, Amit Langote wrote:
>> Okay, let's use "Functions" then, although I wonder if you shouldn't tweak
>> it later when you commit the pg_partition_root patch?
> 
> There are already two calls to pg_partition_tree for each one of the two
> relkinds tested.

You're saying that we should use plural "functions" because there are 2
*instances* of calling the function pg_partition_tree in the queries that
follow the comment, but I think that would be misleading.  I think the
plural would make sense if we're talking about two different functions,
but I may be wrong.

Thanks,
Amit




Re: ToDo: show size of partitioned table

2018-12-17 Thread Pavel Stehule
Hi

On Fri, Nov 30, 2018 at 8:06 PM Pavel Stehule wrote:

>
>
> On Thu, Nov 29, 2018 at 7:58 AM Michael Paquier wrote:
>
>> On Thu, Nov 22, 2018 at 03:47:11PM +0100, Pavel Stehule wrote:
>> > I do not feel comfortable when I see the numbers 40 and 16 in one
>> > report; I feel much more comfortable when I see 24 and 16, but for this
>> > I need a different perspective.
>> >
>> > What do you think about it?
>>
>> Maybe, my thought is that this would be a separate patch, and that we
>> had better keep it simple for now because that's complicated enough :)
>>
>
> In any case, the behavior should be consistent.
>
> I agree, so there can be another patch that reports more about
> partitioned tables for \d+
>
>
>> So I would rather have a wrapper on top of pg_partition_tree which is
>> useful for the user with matching patterns, which shows roughly:
>> - the size of the whole partition tree from the root.
>> - its direct parent if any.  Potentially in the verbose output.  This
>> depends on how many users have multi-level partitions.  My bet would be
>> not that many, but more input from others is welcome.
>> - perhaps its level in the hierarchy, if integrated most likely as part
>> of the verbose output.
>>
>> Then with the set of commands, have a mapping close to how \d treats
>> indexes and relations:
>> - \dp shows information about partitioned tables or indexes; if no pattern
>> is given, tables show up.  If a pattern matching a partitioned index is
>> given, then index information is displayed.
>>
>
> ok
>
> - \dpi and \dpt for respectively partitioned indexes and tables.
>>
>
new update of this patch

changes:

1. Only root partitioned tables are displayed - you can quickly see the total
allocated space. It is not duplicated due to nested partitions.

I can imagine an additional flag - like "n" for nested - and then we could
display nested partitioned tables with parent table info. Something like

\dPt - show only root partitioned tables
\dPnt or \dPtn - show root and nested partitioned tables

2. \dP without a pattern shows root partitioned tables + total relation size.
When a pattern is given, it shows tables and indexes + table size.

postgres=# \dP+
            List of partitioned relations
┌────────┬────────────┬───────┬────────┬─────────────┐
│ Schema │    Name    │ Owner │  Size  │ Description │
╞════════╪════════════╪═══════╪════════╪═════════════╡
│ public │ parent_tab │ pavel │ 120 kB │             │
└────────┴────────────┴───────┴────────┴─────────────┘
(1 row)

postgres=# \dP+ *
                        List of partitioned relations or indexes
┌────────┬──────────────┬───────┬───────────────────┬────────────┬───────┬─────────────┐
│ Schema │     Name     │ Owner │       Type        │   Table    │ Size  │ Description │
╞════════╪══════════════╪═══════╪═══════════════════╪════════════╪═══════╪═════════════╡
│ public │ parent_index │ pavel │ partitioned index │ parent_tab │ 80 kB │             │
│ public │ parent_tab   │ pavel │ partitioned table │            │ 40 kB │             │
└────────┴──────────────┴───────┴───────────────────┴────────────┴───────┴─────────────┘
(2 rows)

postgres=# \dP+ *index
                        List of partitioned relations or indexes
┌────────┬──────────────┬───────┬───────────────────┬────────────┬───────┬─────────────┐
│ Schema │     Name     │ Owner │       Type        │   Table    │ Size  │ Description │
╞════════╪══════════════╪═══════╪═══════════════════╪════════════╪═══════╪═════════════╡
│ public │ parent_index │ pavel │ partitioned index │ parent_tab │ 80 kB │             │
└────────┴──────────────┴───────┴───────────────────┴────────────┴───────┴─────────────┘
(1 row)
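
For the \dP+ output above, the displayed size corresponds roughly to this aggregation over pg_partition_tree (a sketch for illustration, not necessarily the exact query the patch issues):

    SELECT pg_size_pretty(sum(pg_total_relation_size(relid)))
      FROM pg_partition_tree('parent_tab');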

Regards

Pavel


> --
>> Michael
>>
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6c76cf2f00..626c89d055 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1645,6 +1645,53 @@ testdb=
 
   
 
+
+  
+\dP[+] [ pattern ]
+
+
+Lists partitioned relations. If pattern is specified, only
+entries whose relation name or schema name matches
+the pattern are listed. If the form \dP+
+is used, the sum of size of related partitions (including the
+table and indexes, if any) and a description
+are also displayed.
+
+
+  
+
+
+  
+\dPi[+] [ pattern ]
+
+
+Lists partitioned indexes. If pattern is specified, only
+entries whose index name or schema name matches
+the pattern are listed. If the form \dPi+
+is used, the sum of size of related indexes and a description
+are also displayed.
+
+
+  
+
+
+  
+\dPt[+] [ pattern ]
+
+
+Lists partitioned tables. If pattern is specified, only
+entries whose table name or schema name matches
+the pattern are listed. If the form \dPt+
+is used, the sum of size of related tables 

[suggestion]support UNICODE host variables in ECPG

2018-12-17 Thread Nagaura, Ryohei
Hi all.

There is a demand for ECPG to support UNICODE host variables.
This topic has also appeared in thread [1].
I would like to discuss whether to support this in Postgres.

Do you have any opinion?

The specifications and usage described below are the same as in [1].

Requirements

1. Support the utext keyword in ECPG.

The utext keyword is used to define a Unicode host variable in an ECPG
application on the Windows platform.

2. Support the UVARCHAR keyword in ECPG.

The UVARCHAR keyword is used to define a variable-length Unicode host
variable in an ECPG application on the Windows platform.

3. Save the content of Unicode variables into the database in the database
character set, and get content from the database into Unicode variables.

4. Initially, consider only UTF8 as the database character set and UTF16 as
the Unicode format for host variables. A data type conversion between UTF8
and UTF16 will be done by ECPG.

5. Since Unicode has big-endian and little-endian formats, an environment
variable is used to identify them and do the endianness conversion
accordingly.

Usage

int main() {
	EXEC SQL BEGIN DECLARE SECTION;
	utext employee[20];	/* define a Unicode host variable */
	UVARCHAR address[50];	/* define a variable-length Unicode host variable */
	EXEC SQL END DECLARE SECTION;

	...

	EXEC SQL CREATE TABLE emp (ename char(20), address varchar(50));

	/* UTF8 is the database character set */
	EXEC SQL INSERT INTO emp (ename, address) VALUES ('Mike', '1 sydney, NSW');

	/* Database character set converted to Unicode */
	EXEC SQL SELECT ename INTO :employee FROM emp;

	/* Database character set converted to Unicode */
	EXEC SQL SELECT address INTO :address FROM emp;

	wprintf(L"employee name is %s\n", employee);

	wprintf(L"employee address is %s\n", address.attr);

	EXEC SQL DELETE FROM emp;

	/* Unicode converted to the database character set */
	EXEC SQL INSERT INTO emp (ename, address) VALUES (:employee, :address);

	EXEC SQL DROP TABLE emp;
	EXEC SQL DISCONNECT ALL;
}

[1]
https://www.postgresql.org/message-id/flat/CAF3%2BxMLcare1QrDzTxP-3JZyH5SXRkGzNUf-khSgPfmpQpkz%2BA%40mail.gmail.com

Best regards,
-
Ryohei Nagaura





Re: ALTER INDEX ... ALTER COLUMN not present in dump

2018-12-17 Thread amul sul
On Mon, Dec 17, 2018 at 1:57 PM Michael Paquier  wrote:

> On Mon, Dec 17, 2018 at 01:19:00PM +0530, amul sul wrote:
> > Laptop215:PG edb$ git --version
> > git version 2.14.1
>
> Using 2.20, git apply works fine for me but...  If patch -p1 works, why
> not just use it then?  It is always possible to check the patch for
> whitespaces or such with "git diff --check" anyway.

Agreed, I will adopt the same practice in the future, thank you.

Regards,
Amul


Re: ALTER INDEX ... ALTER COLUMN not present in dump

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 01:19:00PM +0530, amul sul wrote:
> Laptop215:PG edb$ git --version
> git version 2.14.1

Using 2.20, git apply works fine for me but...  If patch -p1 works, why
not just use it then?  It is always possible to check the patch for
whitespaces or such with "git diff --check" anyway.
--
Michael




Re: Fixing typos in tests of partition_info.sql

2018-12-17 Thread Michael Paquier
On Mon, Dec 17, 2018 at 04:41:01PM +0900, Amit Langote wrote:
> Okay, let's use "Functions" then, although I wonder if you shouldn't tweak
> it later when you commit the pg_partition_root patch?

There are already two calls to pg_partition_tree for each one of the two
relkinds tested.
--
Michael

