Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2017-09-14 Thread Craig Ringer
On 14 September 2017 at 19:57, Ashutosh Sharma 
wrote:


>
> Are you planning to work on the review comments from Robert and Moon
> Insung and supply a new patch? I just had a quick glance at this
> mail thread (after a long time) and could understand Robert's concern
> to some extent. I think he is trying to say that if a tuple is
> frozen (committed|invalid) then it shouldn't be shown as COMMITTED and
> INVALID together; in fact, it should just be displayed as a FROZEN tuple.


Yes, I'd like to, and should have time for it in this CF.

My plan is to emit raw flags by default, so FROZEN wouldn't be shown at all,
only COMMITTED|INVALID. If the bool to decode combined flags is set, then
it'll show things like FROZEN, and hide COMMITTED|INVALID. Similarly for
other combos.
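For illustration, here is a minimal sketch of the decoding rule being described. The bit values match src/include/access/htup_details.h, but the decode function itself is a hypothetical stand-in, not the patch's actual code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Infomask bits as defined in src/include/access/htup_details.h */
#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200
#define HEAP_XMIN_FROZEN    (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID)

/*
 * Decode xmin status bits into a human-readable string.
 * With decode_combined = 0, raw bits are reported as-is; with 1, the
 * COMMITTED|INVALID combination is folded into FROZEN, as proposed above.
 */
static void decode_xmin(uint16_t infomask, int decode_combined,
                        char *out, size_t len)
{
    out[0] = '\0';
    if (decode_combined && (infomask & HEAP_XMIN_FROZEN) == HEAP_XMIN_FROZEN)
    {
        snprintf(out, len, "XMIN_FROZEN");
        return;
    }
    if (infomask & HEAP_XMIN_COMMITTED)
        strncat(out, "XMIN_COMMITTED|", len - strlen(out) - 1);
    if (infomask & HEAP_XMIN_INVALID)
        strncat(out, "XMIN_INVALID|", len - strlen(out) - 1);
    if (out[0] != '\0')
        out[strlen(out) - 1] = '\0';    /* strip trailing '|' */
}
```

With the combined-decoding flag off, a frozen tuple shows both raw bits; with it on, only FROZEN appears.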

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Allow GiST opclasses without compress/decompress functions

2017-09-14 Thread Alexander Korotkov
On Thu, Sep 14, 2017 at 2:20 PM, Dmitriy Sarafannikov <
dsarafanni...@yandex.ru> wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:tested, passed
>
> This is a simple and intuitive patch. The code looks pretty clear and well
> documented.
>
> I think it is a good idea to decrease the overhead of calling fake
> compressing/decompressing
> functions. This eliminates the need to implement the fetch function in
> such cases.
>
> I also tried to disable compress and decompress functions in contrib/cube:
> diff --git a/contrib/cube/cube--1.2.sql b/contrib/cube/cube--1.2.sql
> index dea2614888..26301b71d7 100644
> --- a/contrib/cube/cube--1.2.sql
> +++ b/contrib/cube/cube--1.2.sql
> @@ -370,8 +370,6 @@ CREATE OPERATOR CLASS gist_cube_ops
>
> FUNCTION1   g_cube_consistent (internal, cube,
> smallint, oid, internal),
> FUNCTION2   g_cube_union (internal, internal),
> -   FUNCTION3   g_cube_compress (internal),
> -   FUNCTION4   g_cube_decompress (internal),
> FUNCTION5   g_cube_penalty (internal, internal,
> internal),
> FUNCTION6   g_cube_picksplit (internal, internal),
> FUNCTION7   g_cube_same (cube, cube, internal),
>
> And it enables IndexOnlyScan, which is the expected behaviour.
> + -- test explain
> + set enable_seqscan to 'off';
> + set enable_bitmapscan to 'off';
> + select count(*) from test_cube where c && '(3000,1000),(0,0)';
> +  count
> + ---
> +  5
> + (1 row)
> +
> + explain analyze select c from test_cube where c && '(3000,1000),(0,0)';
> +QUERY PLAN
> + ------------------------------------------------------------------------------------------------------------
> +  Index Only Scan using test_cube_ix on test_cube  (cost=0.15..56.43
> rows=16 width=32) (actual time=0.015..0.018 rows=5 loops=1)
> +Index Cond: (c && '(3000, 1000),(0, 0)'::cube)
> +Heap Fetches: 5
> +  Planning time: 0.038 ms
> +  Execution time: 0.032 ms
> + (5 rows)
>
> The new status of this patch is: Ready for Committer


It would be good if someone would write a patch removing the useless
compress/decompress methods from the built-in and contrib GiST opclasses,
especially where it gives the benefit of enabling index-only scans.

BTW, this change in the patch looks suspicious to me.

> @@ -913,7 +931,6 @@ gistproperty(Oid index_oid, int attno,
>   ReleaseSysCache(tuple);
>
>   /* Now look up the opclass family and input datatype. */
> -
>   tuple = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
>   if (!HeapTupleIsValid(tuple))
>   {

We typically avoid changing the formatting of code fragments not
directly touched by the patch.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] psql: new help related to variables are not too readable

2017-09-14 Thread Alvaro Herrera
Tom Lane wrote:
> "David G. Johnston"  writes:
> > If I was going to try and read it like a book I'd want the extra
> > white-space to make doing so easier (white-space gives the eye a breather
> > when done with a particular concept) - and the length wouldn't really
> > matter since I'd just make a single pass and be done with it.  But the
> > planned usage is for quick lookup of options that you know (or at least
> > suspect) exist and which you probably have an approximate idea of how they
> > are spelled.  The all-caps and left-justified block headers are distinct
> > enough to scan down - though I'd consider indenting 4 spaces instead of 2
> > to make that even easier (less effort to ignore the indented lines since
> > ignoring nothing is easier than ignoring something).  Having more fit on
> > one screen makes that vertical skimming considerably easier as well (no
> > break and re-acquire when scrolling in a new page).
> 
> Hmm, indenting the descriptions a couple more spaces might be a workable
> compromise.  Anyone want to try that and see what it looks like?
> Preferably somebody who's not happy with the current layout ;-)

I have to admit that adding two spaces makes it look a lot more
acceptable to me.

(I'd tweak the description of PSQL_EDITOR_LINENUMBER_ARG by replacing
"how to" with "mechanism to" while at it, by the way.  It took me a
while to understand what it was and I first thought the description was
completely bogus.)

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list

2017-09-14 Thread Pavel Stehule
Hi

2017-09-14 12:33 GMT+02:00 Anthony Bykov :

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   not tested
> Documentation:tested, failed
>
> Hello,
> As far as I understand, this patch adds functionality (correct me if I'm
> wrong) for users. Shouldn't there be some changes in doc/src/sgml/ with a
> description of the new functionality?
>

It removes an undocumented limit. I rechecked the plpgsql documentation and
there are no lines about prohibited combinations of data types.

There is this line:

where command-string is an expression yielding a string (of type text)
containing the command to be executed. The optional target is a record
variable, a row variable, or a comma-separated list of simple variables and
record/row fields, into which the results of the command will be stored.
The optional USING expressions supply values to be inserted into the
command.

which is correct, if I understand well, with this patch.
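For instance, the previously prohibited combination — a record variable together with a scalar variable in one multi-item INTO list — might look like this. This is a hypothetical illustration, not taken from the patch's own tests, and exactly which columns the record consumes follows the patch's semantics:

```sql
DO $$
DECLARE
  r record;
  n integer;
BEGIN
  -- Before the patch this raised:
  --   record or row variable cannot be part of multiple-item INTO list
  EXECUTE 'SELECT ROW(1, 2), 3' INTO r, n;
END
$$;
```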

Regards

Pavel


>
> Regards
> Anthony
>
> The new status of this patch is: Waiting on Author
>


Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA

2017-09-14 Thread Jesper Pedersen

On 09/11/2017 11:01 AM, Jesper Pedersen wrote:

Thanks for working on this!



Moved to "Ready for Committer".

Best regards,
 Jesper





Re: [HACKERS] no test coverage for ALTER FOREIGN DATA WRAPPER name HANDLER ...

2017-09-14 Thread Ashutosh Bapat
On Wed, Sep 13, 2017 at 1:46 PM, Amit Langote
 wrote:
>>
>> Ok. May be then create_function_1.sql is the right place. Just add it
>> to the section of passing tests and annotate that it's testing
>> creating an FDW handler. Sorry for jumping back and forth.
>
> Alright, done.  Thanks for reviewing.
>

LGTM. The patch applies cleanly on the current HEAD, compiles (the small
bit in regress.c requires compilation), and make check passes. Marking
this as "ready for committer".

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-09-14 Thread Ashutosh Bapat
On Wed, Sep 13, 2017 at 6:58 AM, Thomas Munro
 wrote:
> On Tue, Sep 12, 2017 at 11:23 PM, Ashutosh Bapat
>  wrote:
>> On Wed, Aug 16, 2017 at 11:13 AM, Ashutosh Bapat
>>  wrote:
>>> On Wed, Aug 16, 2017 at 8:44 AM, Alvaro Herrera
>>>  wrote:
 Christoph Berg wrote:
> "Diagnostic message" doesn't really mean anything, and printing
> "DETAIL: Diagnostic message: " seems redundant to me. Maybe
> drop that prefix? It should be clear from the context that this is a
> message from the LDAP layer.

 I think making it visible that the message comes from LDAP (rather than
 Postgres or anything else) is valuable.  How about this?

 LOG:  could not start LDAP TLS session: Protocol error
 DETAIL:  LDAP diagnostics: unsupported extended operation.

>>> +1, pretty neat.
>
> Here is a new version adopting Alvaro's wording.  I'll set this back
> to "Needs review" status.
>

Thanks for the updated patches.

Looks good to me. The patch applies cleanly on the latest HEAD,
compiles without any errors or warnings and make check passes. Marking
this as ready for committer.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction

2017-09-14 Thread Alvaro Herrera
Peter Eisentraut wrote:
> On 9/13/17 09:56, Alvaro Herrera wrote:
> > Tom Lane wrote:
> >> Peter Eisentraut  writes:
> > 
> >>> - Disallow DROP SUBSCRIPTION in a transaction under certain
> >>> circumstances, for example if a transaction has previously manipulated
> >>> the same subscription.
> > 
> >> ISTM the second of those (refuse to drop an in-use subscription) is
> >> by far the least surprising behavior.
> > 
> > +1 for that option.  IIRC this has precedent for other object types such
> > as tables, where we refuse some action if we have already operated on
> > the table in the same transaction.
> 
> What are some examples of such behavior?

Search for CheckTableNotInUse() callers.
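The pattern behind CheckTableNotInUse() — refuse an operation on an object the current transaction is already using — can be sketched in miniature. The names below are invented for illustration; the real function raises an error rather than returning a flag:

```c
#include <assert.h>
#include <string.h>

#define MAX_IN_USE 8

/* Objects (e.g. tables, subscriptions) touched by the current transaction. */
static const char *in_use[MAX_IN_USE];
static int n_in_use = 0;

static void mark_in_use(const char *name)
{
    if (n_in_use < MAX_IN_USE)
        in_use[n_in_use++] = name;
}

/*
 * Analogue of CheckTableNotInUse(): return 0 (refuse the operation) if the
 * object is already in use by this transaction, 1 (allow it) otherwise.
 */
static int check_not_in_use(const char *name)
{
    for (int i = 0; i < n_in_use; i++)
        if (strcmp(in_use[i], name) == 0)
            return 0;
    return 1;
}
```

Applied to this thread, DROP SUBSCRIPTION would refuse to proceed if the same transaction had already manipulated the subscription.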

/me hates that the error messages are split across lines

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2017-09-14 Thread Ashutosh Sharma
Hi Craig,

On Thu, Aug 17, 2017 at 5:50 AM, Craig Ringer  wrote:
> On 16 August 2017 at 23:14, Robert Haas  wrote:
>>
>> On Tue, Aug 15, 2017 at 4:36 PM, Tomas Vondra
>>  wrote:
>> > You might say that people investigating issues in this area of code
>> > should
>> > be aware of how HEAP_XMIN_FROZEN is defined, and perhaps you're right
>> > ...
>>
>> Yes, I think that's what I would say.  I mean, if you happen to NOT
>> know that committed|invalid == frozen, but you DO know what committed
>> means and what invalid means, then you're going to be *really*
>> confused when you see committed and invalid set on the same tuple.
>> Showing you frozen has got to be clearer.
>>
>> Now, I agree with you that a test like (enumval_tup->t_data &
>> HEAP_XMIN_COMMITTED) could be confusing to someone who doesn't realize
>> that HEAP_XMIN_FROZEN & HEAP_XMIN_COMMITTED == HEAP_XMIN_COMMITTED,
>> but I think that's just one of those things that unfortunately is
>> going to require adequate knowledge for people investigating issues.
>> If there's an action item there, it might be to try to come up with a
>> way to make the source code clearer.
>>
>
> For other multi-purpose flags we have macros, and I think it'd make sense to
> use them here too.
>
> Eschew direct use of  HEAP_XMIN_COMMITTED, HEAP_XMIN_INVALID and
> HEAP_XMIN_FROZEN in tests. Instead, consistently use HeapXminIsFrozen(),
> HeapXminIsCommitted(), and HeapXminIsInvalid() or something like that.
>
> --
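Something along those lines could look like this. The bit values are the real ones from htup_details.h, but the macro names are only Craig's suggestion, not existing PostgreSQL macros:

```c
#include <assert.h>

/* Infomask bits as in src/include/access/htup_details.h */
#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200
#define HEAP_XMIN_FROZEN    (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID)

/*
 * Hypothetical test macros: frozen checks both bits together, and
 * committed/invalid each exclude the frozen combination, so the three
 * predicates are mutually exclusive.
 */
#define HeapXminIsFrozen(infomask) \
    (((infomask) & HEAP_XMIN_FROZEN) == HEAP_XMIN_FROZEN)
#define HeapXminIsCommitted(infomask) \
    (((infomask) & HEAP_XMIN_FROZEN) == HEAP_XMIN_COMMITTED)
#define HeapXminIsInvalid(infomask) \
    (((infomask) & HEAP_XMIN_FROZEN) == HEAP_XMIN_INVALID)
```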

Are you planning to work on the review comments from Robert and Moon
Insung and supply a new patch? I just had a quick glance at this
mail thread (after a long time) and could understand Robert's concern
to some extent. I think he is trying to say that if a tuple is
frozen (committed|invalid) then it shouldn't be shown as COMMITTED and
INVALID together; in fact, it should just be displayed as a FROZEN tuple.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] expanding inheritance in partition bound order

2017-09-14 Thread Amit Khandekar
On 14 September 2017 at 06:43, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> Attached updated patch.

@@ -1222,151 +1209,130 @@ PartitionDispatch *
 RelationGetPartitionDispatchInfo(Relation rel,
                                  int *num_parted, List **leaf_part_oids)
 {
+   List   *pdlist;
    PartitionDispatchData **pd;

+   get_partition_dispatch_recurse(rel, NULL, &pdlist, leaf_part_oids);

Above, pdlist is passed uninitialized. And then inside
get_partition_dispatch_recurse(), it is used here:
*pds = lappend(*pds, pd);
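In miniature, this is the classic hazard of an append-only out-parameter: the caller must initialize it. A self-contained analogue, with PostgreSQL's List/lappend replaced by invented stand-ins:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ListCell
{
    int datum;
    struct ListCell *next;
} ListCell;

typedef ListCell List;
#define NIL ((List *) NULL)     /* the empty list, as in PostgreSQL */

/* Minimal stand-in for lappend(): append datum, return the list head. */
static List *lappend_min(List *list, int datum)
{
    ListCell *cell = malloc(sizeof(ListCell));
    cell->datum = datum;
    cell->next = NULL;
    if (list == NIL)
        return cell;
    ListCell *tail = list;
    while (tail->next)
        tail = tail->next;
    tail->next = cell;
    return list;
}

static int list_length_min(const List *list)
{
    int n = 0;
    for (; list; list = list->next)
        n++;
    return n;
}

/*
 * Like get_partition_dispatch_recurse(): it only ever appends to *out,
 * so *out must already be a valid (possibly empty) list on entry.
 */
static void recurse_collect(int depth, List **out)
{
    *out = lappend_min(*out, depth);
    if (depth > 0)
        recurse_collect(depth - 1, out);
}
```

In the patch's terms, the fix is simply `List *pdlist = NIL;` before the recursive call; passing an uninitialized pointer makes the first lappend() read garbage.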



pgindent says more alignment is needed. For example, the gettext_noop() call
below needs to be aligned:
pd->tupmap = convert_tuples_by_name(RelationGetDescr(parent),
tupdesc,
gettext_noop("could not convert row type"));



Other than that, the patch looks good to me. I verified that the leaf
oids are ordered exactly in the order of the UPDATE subplans' output.


-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Allow GiST opclasses without compress/decompress functions

2017-09-14 Thread Dmitriy Sarafannikov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

This is a simple and intuitive patch. The code looks pretty clear and well documented.

I think it is a good idea to decrease the overhead of calling fake
compressing/decompressing
functions. This eliminates the need to implement the fetch function in such
cases.

I also tried to disable compress and decompress functions in contrib/cube:
diff --git a/contrib/cube/cube--1.2.sql b/contrib/cube/cube--1.2.sql
index dea2614888..26301b71d7 100644
--- a/contrib/cube/cube--1.2.sql
+++ b/contrib/cube/cube--1.2.sql
@@ -370,8 +370,6 @@ CREATE OPERATOR CLASS gist_cube_ops

FUNCTION1   g_cube_consistent (internal, cube, smallint, 
oid, internal),
FUNCTION2   g_cube_union (internal, internal),
-   FUNCTION3   g_cube_compress (internal),
-   FUNCTION4   g_cube_decompress (internal),
FUNCTION5   g_cube_penalty (internal, internal, internal),
FUNCTION6   g_cube_picksplit (internal, internal),
FUNCTION7   g_cube_same (cube, cube, internal),

And it enables IndexOnlyScan, which is the expected behaviour.
+ -- test explain
+ set enable_seqscan to 'off';
+ set enable_bitmapscan to 'off';
+ select count(*) from test_cube where c && '(3000,1000),(0,0)';
+  count
+ ---
+  5
+ (1 row)
+
+ explain analyze select c from test_cube where c && '(3000,1000),(0,0)';
+QUERY PLAN
+ ------------------------------------------------------------------------------------------------------------
+  Index Only Scan using test_cube_ix on test_cube  (cost=0.15..56.43 rows=16 
width=32) (actual time=0.015..0.018 rows=5 loops=1)
+Index Cond: (c && '(3000, 1000),(0, 0)'::cube)
+Heap Fetches: 5
+  Planning time: 0.038 ms
+  Execution time: 0.032 ms
+ (5 rows)

The new status of this patch is: Ready for Committer



Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-14 Thread Michael Paquier
On Thu, Sep 14, 2017 at 6:36 PM, Amit Kapila  wrote:
> Why do we need to change metapage at every place for btree ...

I have been hunting for some time for places where meta buffers were
marked as dirtied and logged. In the effort, I think that my hands
and mind got hotter, forgetting that pd_lower has been set there for ages.
Of course, feel free to ignore that.

> ... or hash?
> Any index that is upgraded should have pd_lower set, do you have any
> case in mind where it won't be set?  For hash, if someone upgrades
> from a version lower than 9.6, it might not have set, but we already
> give warning to reindex the hash indexes upgraded from a version lower
> than 10.

Ah yes. You do set pd_lower in 10 as well for hash... So that will be
fine. So SP-GiST remains the slacking AM based on the current patches.
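As a rough sketch of what "setting pd_lower just past the metadata" means: the struct below is a simplified invention (the real PageHeaderData and metapage layouts live in bufpage.h and the AM headers), but the idea is the same — pd_lower marks the end of the used area, so xlog.c can treat the rest of a REGBUF_STANDARD page as a compressible hole in full-page images:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified page header, after src/include/storage/bufpage.h */
typedef struct
{
    uint16_t pd_lower;  /* offset to start of free space */
    uint16_t pd_upper;  /* offset to end of free space */
} PageHeaderMin;

typedef struct
{
    PageHeaderMin hdr;  /* header fields (simplified) */
    uint32_t meta[4];   /* stand-in for the AM's metadata */
} MetaPageMin;

/*
 * Set pd_lower just past the end of the metadata: everything between
 * pd_lower and the end of the page is then a hole that full-page-image
 * logging may skip.
 */
static void set_meta_pd_lower(MetaPageMin *page)
{
    page->hdr.pd_lower =
        (uint16_t) (offsetof(MetaPageMin, meta) + sizeof(page->meta));
}
```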
-- 
Michael




[HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list

2017-09-14 Thread Anthony Bykov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, failed

Hello,
As far as I understand, this patch adds functionality (correct me if I'm wrong)
for users. Shouldn't there be some changes in doc/src/sgml/ with a description
of the new functionality?

Regards
Anthony

The new status of this patch is: Waiting on Author



Re: [HACKERS] Surjective functional indexes

2017-09-14 Thread Simon Riggs
On 14 September 2017 at 10:42, Konstantin Knizhnik
 wrote:
>
>
> On 13.09.2017 14:00, Simon Riggs wrote:
>>
>> On 13 September 2017 at 11:30, Konstantin Knizhnik
>>  wrote:
>>
>>> The only reason for all this discussion about terms is that I need to
>>> choose a name for the corresponding index option.
>>> Simon thinks that we do not need this option at all. In this case we
>>> should not worry about the right term.
>>> From my point of view, "projection" is quite a clear notion, and not only
>>> for mathematics. It is also widely used in IT and especially in DBMSes.
>>
>> If we do have an option it won't be using fancy mathematical
>> terminology at all, it would be described in terms of its function,
>> e.g. recheck_on_update
>>
>> Yes, I'd rather not have an option at all, just some simple code with
>> useful effect, like we have in many other places.
>>
> Attached please find a new version of the projection functional index
> optimization patch.
> I have implemented a very simple autotune strategy: now I use table
> statistics to compare the total number of updates with the number of HOT
> updates. If the fraction of HOT updates is relatively small, then there is
> no sense in spending time performing extra evaluation of the index
> expression and comparing its old and new values.
> Right now the formula is the following:
>
> #define MIN_UPDATES_THRESHOLD 10
> #define HOT_RATIO_THRESHOLD   2
>
> if (stat->tuples_updated > MIN_UPDATES_THRESHOLD
>     && stat->tuples_updated > stat->tuples_hot_updated * HOT_RATIO_THRESHOLD)
> {
>     /*
>      * If the percentage of HOT updates is small, then disable the
>      * projection index function optimization to eliminate the overhead
>      * of extra index expression evaluations.
>      */
>     ii->ii_Projection = false;
> }
>
> These threshold values are pulled out of a hat: I am not sure if this
> heuristic is right.
> I will be pleased to get feedback on whether such an approach to autotune is
> promising.

Hmm, not really, but thanks for trying.

This works by looking at overall stats, and only looks at the overall
HOT %, so it's too heavyweight and coarse.

I suggested storing stat info on the relcache and was expecting you
would look at how often the expression evaluates to new == old. If we
evaluate new against old many times, then if the success rate is low
we should stop attempting the comparison. (<10%?)

Another idea:
If we don't make a check when we should have done, then we will get a
non-HOT update, so we waste the extra time difference between a HOT
and a non-HOT update. If we check and fail, we waste the time taken to
perform the check. So the question is how expensive the check is against
how expensive a non-HOT update is. Could we simply say we don't bother to
check functions that have a cost higher than 1? So if the user
doesn't want to perform the check they can just increase the cost of
the function above the check threshold?
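Under Simon's last suggestion, no new reloption would be needed at all: the existing function-cost knob would act as the switch. Hypothetically (the function name here is invented):

```sql
-- With a cost-based threshold, raising the indexed function's declared
-- cost above the threshold would disable the old-vs-new recheck for
-- indexes built on that function.
ALTER FUNCTION expensive_extract(jsonb) COST 1000;
```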

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader range of object drops.

2017-09-14 Thread Alvaro Herrera
Hadi Moshayedi wrote:
> To provide more context, in cstore_fdw creating the storage is easy, we
> only need to hook into CREATE FOREIGN TABLE using event triggers. Removing
> the storage is not that easy, for DROP FOREIGN TABLE we can use event
> triggers.

This all sounds a little more complicated than it should ... but since
FDWs are not really, IMO, the right interface for implementing a
different storage format, I'm not terribly on board with supporting this
more completely.  So what you're doing now is probably acceptable.

> But when we do DROP EXTENSION, the event triggers don't get fired
> (because they have already been dropped),

This however sounds like a silly design bug ... surely the event
triggers should have been fired when the table is dropped, before
dropping the event triggers themselves.  I can't offhand think of any
good way to fix that, however.

> so to handle DROP EXTENSION, we need to hook into the process utility
> hook. Now to implement this, (1) we get a list of all cstore tables
> (2) call postgres's utility hook, (3) if #2 succeeds clean-up all
> cstore table storage's. But when #3 happens the relation isn't there
> anymore, so we create a pseudo-relation [1] and call
> RelationDropStorage().

This sounds terrible.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT

2017-09-14 Thread Kyotaro HORIGUCHI
At Wed, 13 Sep 2017 20:20:48 -0400, Robert Haas wrote:
> On Mon, Jun 26, 2017 at 4:46 AM, Kyotaro HORIGUCHI
>  wrote:
> > The cause is that NO INHERIT doesn't take an exclusive lock on the
> > parent. This allows expand_inherited_rtentry to add the child
> > relation into the appendrel after its removal from the inheritance,
> > while it still exists.
> >
> > I see two ways to fix this.
> >
> > The first patch adds a recheck of inheritance relationship if the
> > corresponding attribute is missing in the child in
> > make_inh_translation_list(). The recheck is a bit complex but it
> > is not performed unless the sequence above is happen. It checks
> > duplication of relid (or cycles in inheritance) following
> > find_all_inheritors (but doing a bit different) but I'm not sure
> > it is really useful.
> >
> >
> > The second patch lets ALTER TABLE NO INHERIT to acquire locks on
> > the parent first.
> >
> > Since the latter has a larger impact on the current behavior and
> > we already treat "DROP TABLE child" case in the similar way, I
> > suppose that the first approach would be preferable.
> >
> >
> > Any comments or thoughts?
> 
> I agree that the second has less user impact, but I wonder if we can
> think that will really fix the bug completely, or more generally if it
> will fix all of the bugs that come from ALTER TABLE .. NO INHERIT not
> locking the parent.  I have a sneaking suspicion that may be wishful
> thinking.

Thanks for the comment.

The recheck patch prevents the planner from involving just-detached
children during inheritance expansion. Simultaneous detaching of
children doesn't affect the planning before that time.

Once the planner (the select side) gets a lock on the child, the alter
side cannot do anything until the select finishes. If the alter
side won, the select side detects the detach immediately after the
lock is released and then excludes the children. No problem will
occur ever after. Even in the case where a child is replaced with
another table, it is nothing different from simple detaching.

As a result, I think that the recheck patch solves all possible
problems caused by simultaneously detached children.

However, the parent-locking patch is far smaller, and it doesn't
need such an explanation of its correctness. If another problem
occurs from simultaneous detaching, it must be because the
necessary locks were not taken in the right order.

The most significant reason for my decision on this patch was the
fact that the DROP child case has been resolved by rechecking
without parent locks; taking them would be a kind of waste of resources
if the problem can be resolved without them. (And I saw that the same
solution is taken in at least several places.)

I don't think there's a noticeable difference in behavior
(excluding performance) for users between the two solutions.

Is there anyone who has an opinion on the point?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] Warnings "unrecognized node type" for some DDLs with log_statement = 'ddl'

2017-09-14 Thread Ashutosh Sharma
On Thu, Sep 14, 2017 at 9:24 AM, Michael Paquier
 wrote:
> Hi all,
>
> While reviewing another patch, I have bumped into a couple of failures
> when running installcheck if log_statement = 'ddl'. This pops
> regression failures for 4 tests: object_address, alter_generic,
> alter_operator and stats_ext involving commands CREATE STATISTICS and
> ALTER OPERATOR.
>
> You can as well reproduce the failures using simply that:
> =# create table aa (a int, b int);
> CREATE TABLE
> =# CREATE STATISTICS aa_stat ON a, b FROM aa;
> WARNING:  01000: unrecognized node type: 332
> LOCATION:  GetCommandLogLevel, utility.c:3357
> ERROR:  42P17: extended statistics require at least 2 columns
> LOCATION:  CreateStatistics, statscmds.c:220
> =# ALTER OPERATOR = (boolean, boolean) SET (RESTRICT = NONE);
> WARNING:  01000: unrecognized node type: 294
> LOCATION:  GetCommandLogLevel, utility.c:3357
> ALTER OPERATOR
>
> Attached is a patch to fix all the failures I have spotted. As CREATE
> STATISTICS is new in PG10, I am adding an open item as things come
> from 7b504eb2. The problems of ALTER OPERATOR are introduced by 9.6.
> Still I would suggest to fix everything at the same time. The problem
> comes from GetCommandLogLevel() which forgot to add T_CreateStatsStmt
> and T_AlterOperatorStmt. I have noticed as well that
> T_AlterCollationStmt was missing.
>
Hmm... I am also able to reproduce the failures you reported. Your
patch does solve the problem. Just to confirm whether we are still missing
tags for any other statement nodes, I had a quick look at the list
given in the 'nodes.h' file but couldn't find any missing nodes. So, yes,
your patch does solve the problem completely and looks good to me.
Thanks.
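The shape of the fix is just adding the missing tags to the switch in GetCommandLogLevel(). The tag names below are the real ones Michael lists, but the switch is a simplified stand-in for the one in src/backend/tcop/utility.c:

```c
#include <assert.h>

/* Simplified stand-ins for the real NodeTag and LogStmtLevel enums */
typedef enum
{
    T_CreateStatsStmt,
    T_AlterOperatorStmt,
    T_AlterCollationStmt,
    T_SelectStmt
} NodeTagMin;

typedef enum { LOGSTMT_ALL, LOGSTMT_DDL } LogStmtLevelMin;

/*
 * Each DDL node tag must have an explicit case; a tag missing from the
 * switch is exactly what produced the "unrecognized node type" warnings.
 */
static LogStmtLevelMin get_command_log_level_min(NodeTagMin tag)
{
    switch (tag)
    {
        case T_CreateStatsStmt:
        case T_AlterOperatorStmt:
        case T_AlterCollationStmt:
            return LOGSTMT_DDL;
        default:
            return LOGSTMT_ALL;
    }
}
```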

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Surjective functional indexes

2017-09-14 Thread Konstantin Knizhnik



On 13.09.2017 14:00, Simon Riggs wrote:

On 13 September 2017 at 11:30, Konstantin Knizhnik
 wrote:


The only reason for all this discussion about terms is that I need to choose
a name for the corresponding index option.
Simon thinks that we do not need this option at all. In this case we should
not worry about the right term.
From my point of view, "projection" is quite a clear notion, and not only for
mathematics. It is also widely used in IT and especially in DBMSes.

If we do have an option it won't be using fancy mathematical
terminology at all, it would be described in terms of its function,
e.g. recheck_on_update

Yes, I'd rather not have an option at all, just some simple code with
useful effect, like we have in many other places.

Attached please find a new version of the projection functional index
optimization patch.
I have implemented a very simple autotune strategy: now I use table
statistics to compare the total number of updates with the number of HOT
updates. If the fraction of HOT updates is relatively small, then there is
no sense in spending time performing extra evaluation of the index
expression and comparing its old and new values.

Right now the formula is the following:

#define MIN_UPDATES_THRESHOLD 10
#define HOT_RATIO_THRESHOLD   2

if (stat->tuples_updated > MIN_UPDATES_THRESHOLD
    && stat->tuples_updated > stat->tuples_hot_updated * HOT_RATIO_THRESHOLD)
{
    /*
     * If the percentage of HOT updates is small, then disable the
     * projection index function optimization to eliminate the overhead
     * of extra index expression evaluations.
     */
    ii->ii_Projection = false;
}

These threshold values are pulled out of a hat: I am not sure if this
heuristic is right.

I will be pleased to get feedback on whether such an approach to autotune is
promising.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 83ee7d3..52189ac 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 The optional WITH clause specifies storage
 parameters for the index.  Each index method has its own set of allowed
-storage parameters.  The B-tree, hash, GiST and SP-GiST index methods all
-accept this parameter:
+storage parameters. All indexes accept the following parameter:
+   
+
+   
+   
+projection
+
+ 
+   A functional index is based on a projection function: a function which extracts a subset of its argument.
+   In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed
+   expression is changed, then value of index expression is also changed. So to check that index is affected by the
+   update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered
+   as non-injective.
+   In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for
+   the expression expression(bookinfo-'isbn') defined
+   for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most
+   functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates
+   index only when function results are different. It allows to eliminate index update and use HOT update.
+   But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect
+   the function value is small, then marking index as projection may increase update speed.
+
+
+   
+   
+
+   
+ The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:

 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ec10762..b73165f 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -130,6 +130,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"projection",
+			"Evaluate functional index expression on update to check if its value is changed",
+			RELOPT_KIND_INDEX,
+			AccessExclusiveLock
+		},
+		true
+	},
+	{
+		{
 			"security_barrier",
 			"View acts as a row security barrier",
 			RELOPT_KIND_VIEW,
@@ -1301,7 +1310,7 @@ fillRelOptions(void *rdopts, Size basesize,
 break;
 			}
 		}
-		if (validate && !found)
+		if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX)
 			elog(ERROR, "reloption \"%s\" not found in parse table",
  options[i].gen->name);
 	}
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e29c5ad..05e372f 100644
--- a/src/backend/access/heap/heapam.c
+++ 
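The update-skipping idea described in the patch above (re-evaluate the indexed expression on the old and new tuple versions, and touch the index only when the result differs) can be modeled in a few lines of Python. This is an illustrative sketch, not the patch's actual C code; all function and variable names here are invented.

```python
def index_needs_update(index_expr, old_row, new_row, changed_cols, index_cols,
                       projection=True):
    """Decide whether an index must be updated after an UPDATE.

    index_expr: the indexed expression, as a function of a row dict.
    projection=True models the proposed reloption default: the expression
    is assumed non-injective, so we re-evaluate it on both tuple versions
    and skip the index update when the results match.
    """
    # If no indexed column changed, the index is never affected.
    if not (changed_cols & index_cols):
        return False
    if not projection:
        # Injective case: any change to an indexed column changes the key.
        return True
    # Non-injective case: pay two extra evaluations to possibly skip the update.
    return index_expr(old_row) != index_expr(new_row)

# An expression like (bookinfo->'isbn'): only the 'isbn' field matters.
isbn_expr = lambda row: row["bookinfo"].get("isbn")

old = {"bookinfo": {"isbn": "123", "title": "A"}}
new = {"bookinfo": {"isbn": "123", "title": "B"}}  # title changed, ISBN same
print(index_needs_update(isbn_expr, old, new, {"bookinfo"}, {"bookinfo"}))  # False
```

With `projection=False` the same update would force an index update, since the changed-column check alone cannot prove the key is unchanged.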

Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-14 Thread Amit Kapila
On Thu, Sep 14, 2017 at 12:30 PM, Michael Paquier
 wrote:
> On Wed, Sep 13, 2017 at 4:43 PM, Amit Langote
>  wrote:
>> Sure, no problem.
>

>
> +*
> +* This won't be of any help unless we use option like REGBUF_STANDARD
> +* while registering the buffer for metapage as otherwise, it won't try to
> +* remove the hole while logging the full page image.
>  */
> This comment is in the btree code. But you actually add
> REGBUF_STANDARD. So I think that this could be just removed.
>
>  * Set pd_lower just past the end of the metadata.  This is not essential
> -* but it makes the page look compressible to xlog.c.
> +* but it makes the page look compressible to xlog.c.  See
> +* _bt_initmetapage.
> This reference could be removed as well as _bt_initmetapage does not
> provide any information, the existing comment is wrong without your
> patch, and then becomes right with this patch.
>

I have added this comment just to give some explanation as to why we
are setting pd_lower and what makes it useful.  We can change it or
remove it, but I am not sure what the right thing to do here is;
maybe we can defer this to the committer.

>
> It seems to me that an update of pd_lower is missing in _bt_getroot(),
> just before marking the buffer as dirty I think. And there is a second
> in _bt_unlink_halfdead_page(), a third in _bt_insertonpg() and finally
> one in _bt_newroot().
>
>
> For hash, hashbulkdelete(), _hash_vacuum_one_page(),
> _hash_addovflpage(), _hash_freeovflpage() and _hash_doinsert() are
> missing the shot, no? We could have a meta page of a hash index
> upgraded from PG10.
>

Why do we need to change the metapage at every place for btree or hash?
Any index that is upgraded should have pd_lower set; do you have any
case in mind where it won't be set?  For hash, if someone upgrades
from a version lower than 9.6, it might not be set, but we already
give a warning to reindex hash indexes upgraded from a version lower
than 10.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module

2017-09-14 Thread Alexey Chernyshov
On Tue, 12 Sep 2017 12:59:20 -0400
Tom Lane  wrote:

> Quick comment on this patch: recently, we've decided that having
> patches replace the whole base script for an extension is too much of
> a maintenance problem, especially when there are several patches in
> the pipeline for the same contrib module.  The new style is to
> provide only a version update script (which you'd have to write
> anyway), and then rely on CREATE EXTENSION to apply the old base
> script plus the update(s). You can see some examples in the patch I
> just posted at
> 
> https://www.postgresql.org/message-id/24721.1505229...@sss.pgh.pa.us
> 
> Also, since that patch is probably going to get committed pretty
> soon, you could reformulate your patch as an add-on to its
> citext--1.4--1.5.sql script.  We don't really need to have a separate
> version of the extension for states that are intermediate between two
> PG major releases.  Only if your patch doesn't get in by v11 freeze
> would you need to make it a separate citext--1.5--1.6.sql script.
> 
>   regards, tom lane

Accented characters and different length strings tests added.
Since the patch
(https://www.postgresql.org/message-id/24721.1505229...@sss.pgh.pa.us)
is committed, I changed the patch as you said. Thanks for your notes.
Do we need expected/citext.out? It seems that only
expected/citext_1.out has correct output.

-- 
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From 35aa8d7a1890a242cdfb3bed6cabaa3b39766f1f Mon Sep 17 00:00:00 2001
From: Alexey Chernyshov 
Date: Tue, 18 Jul 2017 13:50:19 +0300
Subject: [PATCH] patch applied

---
 contrib/citext/citext--1.4--1.5.sql  |  74 +++
 contrib/citext/citext.c  | 120 
 contrib/citext/expected/citext.out   | 370 +++
 contrib/citext/expected/citext_1.out | 370 +++
 contrib/citext/sql/citext.sql|  78 
 5 files changed, 1012 insertions(+)

diff --git a/contrib/citext/citext--1.4--1.5.sql b/contrib/citext/citext--1.4--1.5.sql
index 97942cb..5ae522b 100644
--- a/contrib/citext/citext--1.4--1.5.sql
+++ b/contrib/citext/citext--1.4--1.5.sql
@@ -12,3 +12,77 @@ ALTER OPERATOR >= (citext, citext) SET (
 RESTRICT   = scalargesel,
 JOIN   = scalargejoinsel
 );
+
+CREATE FUNCTION citext_pattern_lt( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION citext_pattern_le( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION citext_pattern_gt( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION citext_pattern_ge( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE OPERATOR ~<~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~>=~,
+COMMUTATOR = ~>~,
+PROCEDURE  = citext_pattern_lt,
+RESTRICT   = scalarltsel,
+JOIN   = scalarltjoinsel
+);
+
+CREATE OPERATOR ~<=~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~>~,
+COMMUTATOR = ~>=~,
+PROCEDURE  = citext_pattern_le,
+RESTRICT   = scalarltsel,
+JOIN   = scalarltjoinsel
+);
+
+CREATE OPERATOR ~>=~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~<~,
+COMMUTATOR = ~<=~,
+PROCEDURE  = citext_pattern_ge,
+RESTRICT   = scalargtsel,
+JOIN   = scalargtjoinsel
+);
+
+CREATE OPERATOR ~>~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~<=~,
+COMMUTATOR = ~<~,
+PROCEDURE  = citext_pattern_gt,
+RESTRICT   = scalargtsel,
+JOIN   = scalargtjoinsel
+);
+
+CREATE FUNCTION citext_pattern_cmp(citext, citext)
+RETURNS int4
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT IMMUTABLE PARALLEL SAFE;
+
+CREATE OPERATOR CLASS citext_pattern_ops
+FOR TYPE CITEXT USING btree AS
+OPERATOR1   ~<~  (citext, citext),
+OPERATOR2   ~<=~ (citext, citext),
+OPERATOR3   =(citext, citext),
+OPERATOR4   ~>=~ (citext, citext),
+OPERATOR5   ~>~  (citext, citext),
+FUNCTION1   citext_pattern_cmp(citext, citext);
diff --git a/contrib/citext/citext.c b/contrib/citext/citext.c
index 0ba4782..189105a 100644
--- a/contrib/citext/citext.c
+++ b/contrib/citext/citext.c
@@ -18,6 +18,7 @@ PG_MODULE_MAGIC;
  */
 
 static int32 citextcmp(text *left, text *right, Oid collid);
+static int32 internal_citext_pattern_cmp(text *left, text *right, Oid collid);
 
 /*
  *		=
@@ -59,6 +60,40 @@ citextcmp(text *left, text *right, Oid collid)
 }
 
 /*
+ * citext_pattern_cmp()
+ * Internal character-by-character comparison function for citext strings.
+ * Returns int32 negative, zero, or positive.
+ */
+static int32
+internal_citext_pattern_cmp(text *left, 
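The comparison the new function implements — case-folded, character-by-character, returning a negative, zero, or positive value — can be modeled in Python. This is only a sketch of the contract, not the actual C implementation, which works on detoasted text in the server encoding:

```python
def citext_pattern_cmp(left: str, right: str) -> int:
    """Case-insensitive comparison returning int32-style -1/0/+1.

    This sketch folds case per string and compares code points; the
    real code has to deal with encodings and multibyte characters.
    """
    a, b = left.lower(), right.lower()
    return (a > b) - (a < b)

print(citext_pattern_cmp("ABC", "abc"))  # 0
```

Such a three-way comparator is exactly what a btree operator class's support function 1 must provide, consistent with the `~<~` family of operators defined in the SQL script above.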

Re: [HACKERS] SCRAM in the PG 10 release notes

2017-09-14 Thread Alvaro Hernandez



On 14/09/17 08:57, Heikki Linnakangas wrote:

On 09/12/2017 04:09 AM, Noah Misch wrote:

On Wed, May 10, 2017 at 10:50:51PM -0400, Bruce Momjian wrote:

On Mon, May  1, 2017 at 08:12:51AM -0400, Robert Haas wrote:
On Tue, Apr 25, 2017 at 10:16 PM, Bruce Momjian  
wrote:

Well, we could add "MD5 users are encouraged to switch to
SCRAM-SHA-256".  Now whether we want to list this as something on the
SCRAM-SHA-256 description, or mention it as an incompatibility, or
under Migration.  I am not clear that MD5 is in such terrible 
shape that

this is warranted.


I think it's warranted.  The continuing use of MD5 has been a headache
for some EnterpriseDB customers who have compliance requirements which
they must meet.  It's not that they themselves necessarily know or
care whether MD5 is secure, although in some cases they do; it's that
if they use it, they will be breaking laws or regulations to which
their business or agency is subject.  I imagine customers of other
PostgreSQL companies have similar issues.  But leaving that aside, the
advantage of SCRAM isn't merely that it uses a better algorithm to
hash the password.  It has other advantages also, like not being
vulnerable to replay attacks.  If you're doing password
authentication, you should really be using SCRAM, and encouraging
people to move to SCRAM after upgrading is a good idea.

That having been said, SCRAM is a wire protocol break.  You will not
be able to upgrade to SCRAM unless and until the drivers you use to
connect to the database add support for it.  The only such driver
that's part of libpq; other drivers that have reimplemented the
PostgreSQL wire protocol will have to be updated with SCRAM support
before it will be possible to use SCRAM with those drivers. I think
this should be mentioned in the release notes, too.  I also think it
would be great if somebody would put together a wiki page listing all
the popular drivers and (1) whether they use libpq or reimplement the
wire protocol, and (2) if the latter, the status of any efforts to
implement SCRAM, and (3) if those efforts have been completed, the
version from which they support SCRAM.  Then, I think we should reach
out to all of the maintainers of those driver authors who aren't
moving to support SCRAM and encourage them to do so.


I have added this as an open item because we will have to wait to see
where we are with driver support as the release gets closer.


With the release near, I'm promoting this to the regular open issues 
section.


Thanks.

I updated the list of drivers on the wiki 
(https://wiki.postgresql.org/wiki/List_of_drivers), adding a column 
for whether the driver supports SCRAM authentication. Currently, the 
only non-libpq driver that has implemented SCRAM is the JDBC driver. I 
submitted a patch for the Go driver, but it hasn't been committed yet.


On the JDBC driver, strictly speaking, code has not been released 
yet. It is scheduled for v 42.2.0, and maybe the wiki should also 
mention from what version of the driver it is supported (I guess for all 
cases, unless their versioning would be synced with PostgreSQL's).



Álvaro


--

Alvaro Hernandez


---
OnGres





Re: [HACKERS] uninterruptible state in 10beta4

2017-09-14 Thread Andres Freund
On 2017-09-13 17:53:44 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Indeed that seems plausible. I guess something like the attached should
> > fix the issue?
> 
> Ah, I see you came to the same conclusion I did.  But see comment
> about adding a comment.

I've pushed something - Not sure if you'd anything else in mind WRT
comment...

Greetings,

Andres Freund




Re: [HACKERS] uninterruptible state in 10beta4

2017-09-14 Thread Andres Freund
On 2017-09-13 14:46:08 -0700, Jeff Janes wrote:
> On Wed, Sep 13, 2017 at 2:41 PM, Andres Freund  wrote:
> > Indeed that seems plausible. I guess something like the attached should
> > fix the issue?
> >
> 
> Yep, that fixes it.

Cool. Pushed. Thanks for the report!

Thanks,

Andres




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b

2017-09-14 Thread Amit Langote
On 2017/09/14 16:53, Dean Rasheed wrote:
> On 13 September 2017 at 10:05, Amit Langote
>  wrote:
>> Coincidentally, I just wrote the patch for canonicalizing stored values,
>> instead of erroring out.  Please see attached if that's what you were
>> thinking too.
>>
> 
> Looks reasonable to me, if we decide to go this way.>
> One minor review comment --

Thanks for the review.

> it isn't really necessary to have the
> separate new boolean local variables as well as the datum kind
> variables. Just tracking the datum kind is sufficient and slightly
> simpler. That would also address a slight worry I have that your
> coding might result in a compiler warning about the kind variables
> being used uninitialised with some less intelligent compilers, not
> smart enough to work out that it can't actually happen.

Got rid of the boolean variables in the updated patch.

Thanks,
Amit
From 61bfaeb46623572f5adba0b624d00dfecb6ed495 Mon Sep 17 00:00:00 2001
From: amit 
Date: Wed, 13 Sep 2017 17:51:38 +0900
Subject: [PATCH] Canonicalize catalog representation of range partition bounds

Since the system effectively ignores values of the bound for
partition key columns following the first unbounded key, it's pointless
to accurately remember them in the system catalog.  Instead store
minvalue or maxvalue for all such columns, matching what the first
unbounded key is.  This makes \d output of range partitions look
more canonical.
---
 doc/src/sgml/ref/create_table.sgml |  6 +-
 src/backend/parser/parse_utilcmd.c | 30 --
 src/test/regress/expected/create_table.out | 20 +---
 src/test/regress/expected/insert.out   |  8 
 src/test/regress/sql/create_table.sql  |  6 ++
 5 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 824253de40..7d2eb30722 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -328,7 +328,11 @@ FROM ( { numeric_literal | MAXVALUE in a partition bound are ignored; so the bound
   (10, MINVALUE, 0) is equivalent to
   (10, MINVALUE, 10) and (10, MINVALUE, MINVALUE)
-  and (10, MINVALUE, MAXVALUE).
+  and (10, MINVALUE, MAXVALUE).  Moreover, the system will
+  store (10, MINVALUE, MINVALUE) into the system catalog if
+  the user specifies (10, MINVALUE, 0), and
+  (10, MAXVALUE, MAXVALUE) if the user specifies
+  (10, MAXVALUE, 0).
  
 
  
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 655da02c10..ca983105cc 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -3381,6 +3381,8 @@ transformPartitionBound(ParseState *pstate, Relation parent,
				   *cell2;
	int			i,
				j;
+	PartitionRangeDatumKind lower_kind = PARTITION_RANGE_DATUM_VALUE,
+				upper_kind = PARTITION_RANGE_DATUM_VALUE;
 
	if (spec->strategy != PARTITION_STRATEGY_RANGE)
		ereport(ERROR,
@@ -3426,7 +3428,17 @@ transformPartitionBound(ParseState *pstate, Relation parent,
		coltype = get_partition_col_typid(key, i);
		coltypmod = get_partition_col_typmod(key, i);
 
-		if (ldatum->value)
+		/*
+		 * If we've found the first infinite bound, use the same for
+		 * subsequent columns.
+		 */
+		if (lower_kind != PARTITION_RANGE_DATUM_VALUE)
+		{
+			ldatum = copyObject(ldatum);	/* don't scribble on input */
+			ldatum->kind = lower_kind;
+			ldatum->value = NULL;
+		}
+		else if (ldatum->value)
		{
			con = castNode(A_Const, ldatum->value);
			value = transformPartitionBoundValue(pstate, con,
@@ -3439,8 +3451,20 @@ transformPartitionBound(ParseState *pstate, Relation parent,
			ldatum = copyObject(ldatum);	/* don't scribble on input */
			ldatum->value = (Node *) value;
		}
+		else
+			lower_kind = ldatum->kind;
 
-		if (rdatum->value)
+		/*
+		 * If we've found the first infinite bound, use the same for
+		 * subsequent columns.
+		 */
+		if (upper_kind != PARTITION_RANGE_DATUM_VALUE)
+  
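The canonicalization the patch performs — everything after the first MINVALUE/MAXVALUE is overwritten with that same kind, since the system ignores those values anyway — can be sketched independently of the backend code. All names here are invented for illustration:

```python
VALUE, MINVALUE, MAXVALUE = "value", "minvalue", "maxvalue"

def canonicalize_bound(datums):
    """datums: list of (kind, value) pairs for one range partition bound.

    Once the first infinite bound (MINVALUE/MAXVALUE) is seen, all
    following columns are forced to that same kind with no value, which
    is what the patch stores in the system catalog.
    """
    result, current = [], VALUE
    for kind, value in datums:
        if current != VALUE:
            result.append((current, None))   # overwrite ignored trailing datum
        elif kind != VALUE:
            current = kind
            result.append((kind, None))
        else:
            result.append((kind, value))
    return result

print(canonicalize_bound([(VALUE, 10), (MINVALUE, None), (VALUE, 0)]))
# [('value', 10), ('minvalue', None), ('minvalue', None)]
```

This matches the documented equivalence: `(10, MINVALUE, 0)` is stored as `(10, MINVALUE, MINVALUE)`.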

Re: [HACKERS] [POC] hash partitioning

2017-09-14 Thread amul sul
On Wed, Sep 13, 2017 at 7:43 PM, Jesper Pedersen  wrote:

> Hi Amul,
>
> On 09/08/2017 08:40 AM, amul sul wrote:
>
>> Rebased 0002 against this commit & renamed to 0001, PFA.
>>
>>
> This patch needs a rebase.
>
>
Thanks for your note.
Attached is the patch rebased on the latest master head.
Also added an error on creating a default partition for the
hash-partitioned table, and updated the document & test script for
the same.

Regards,
Amul


0001-hash-partitioning_another_design-v19.patch
Description: Binary data



Re: [HACKERS] POC: Cache data in GetSnapshotData()

2017-09-14 Thread Mithun Cy
On Wed, Sep 13, 2017 at 7:24 PM, Jesper Pedersen
 wrote:
> I have done a run with this patch on a 2S/28C/56T/256Gb w/ 2 x RAID10 SSD
> machine.
>
> Both for -M prepared, and -M prepared -S I'm not seeing any improvements (1
> to 375 clients); e.g. +-1%.

My test was done on an 8-socket NUMA Intel machine, where I could
clearly see improvements, as posted before.
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):128
On-line CPU(s) list:   0-127
Thread(s) per core:2
Core(s) per socket:8
Socket(s): 8
NUMA node(s):  8
Vendor ID: GenuineIntel
CPU family:6
Model: 47
Model name:Intel(R) Xeon(R) CPU E7- 8830  @ 2.13GHz
Stepping:  2
CPU MHz:   1064.000
BogoMIPS:  4266.62
Virtualization:VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  256K
L3 cache:  24576K
NUMA node0 CPU(s): 0,65-71,96-103
NUMA node1 CPU(s): 72-79,104-111
NUMA node2 CPU(s): 80-87,112-119
NUMA node3 CPU(s): 88-95,120-127
NUMA node4 CPU(s): 1-8,33-40
NUMA node5 CPU(s): 9-16,41-48
NUMA node6 CPU(s): 17-24,49-56
NUMA node7 CPU(s): 25-32,57-64

Let me recheck whether the improvement is due to NUMA or cache sizes.
Currently the above machine is not available to me; it will take 2
more days to get access to it again.

> Although the -M prepared -S case should improve, I'm not sure that the extra
> overhead in the -M prepared case is worth the added code complexity.

Each tpcb-like (-M prepared) transaction in pgbench has 3 updates, 1
insert and 1 select statement. There will be more GetSnapshotData
calls than transaction ends (which invalidate the cached snapshot),
so I think whatever we cache has a higher chance of getting used
before it is invalidated with -M prepared. But in the worst case, a
quick write transaction that invalidates the cached snapshot before
it is reused turns the caching into pure overhead.
As Andres has suggested previously, I need a mechanism to identify
and avoid caching in such scenarios. I do not have the right solution
for this at present, but one thing we can try is: just before
caching, if we see an exclusive request in the wait queue of
ProcArrayLock, we can skip caching.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Patches that don't apply or don't compile: 2017-09-12

2017-09-14 Thread Aleksander Alekseev
Hi Martin,

> > === Build Failed: 7 ===
> > Title: Fix the optimization to skip WAL-logging on table created in same 
> > transaction
> > Author: Martijn van Oosterhout 
> > URL: https://commitfest.postgresql.org/14/528/
> 
> I'm not the author of this patch, and the page doesn't say so.
> Something wrong with your script?

It's a bug. The authors were determined from the senders of the first
email in the corresponding thread, because I needed an email list to
CC the authors and the commitfest application doesn't show email
addresses.

Sorry for the disturbance.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] Optimise default partition scanning while adding new partition

2017-09-14 Thread Jeevan Ladhe
Thanks Amit for reviewing.

> Patch looks fine to me.  By the way, why don't we just say "Can we skip
> scanning part_rel?" in the comment before the newly added call to
> PartConstraintImpliedByRelConstraint()?  We don't need to repeat the
> explanation of what it does at every place we call it.

I have changed the comment.
PFA.

Regards,
Jeevan Ladhe


0001-Check-default-partitition-child-validation-scan-is.patch
Description: Binary data



[HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b

2017-09-14 Thread Dean Rasheed
On 14 September 2017 at 03:49, Noah Misch  wrote:
> On Wed, Sep 13, 2017 at 12:06:40PM -0400, Robert Haas wrote:
>> OK, thanks.  I still don't really like allowing this, but I can see
>> that compatibility with other systems has some value here, and if
>> nobody else is rejecting these cases, maybe we shouldn't either.  So
>> I'll hold my nose and change my vote to canonicalizing rather than
>> rejecting outright.
>
> I vote for rejecting it.  DDL compatibility is less valuable than other
> compatibility.  The hypothetical affected application can change its DDL to
> placate PostgreSQL and use that modified DDL for all other databases, too.

So we have 3 options:

1. Do nothing, allowing and keeping any values after a MINVALUE/MAXVALUE.

2. Allow such values, but canonicalise what we store.

3. Reject anything other than MINVALUE/MAXVALUE after MINVALUE/MAXVALUE.


My order of preference is still (1), (2) then (3) because:

- Compatibility.
- Less verbose / easier to type.
- We allow multiple syntaxes for equivalent constraints in other places,
  without canonicalising what the user enters.
- Regarding Robert's coding argument, code in general should not be
  looking at and basing decisions on any values it sees after a
  MINVALUE/MAXVALUE. If it does, at the very least it's doing more work
  than it needs to, and at worst, there's a potential bug there which
  would be more readily exposed by allowing arbitrary values there. Code
  that only worked because of earlier canonicalisation would
  strike me as being somewhat fragile.

However, it seems that there is a clear consensus towards (2) or (3)
and we have viable patches for each, although I'm not sure which of
those the consensus is really leaning towards.

Robert, since partitioning was originally your commit, and you know
the wider codebase better, I'm happy to go with whatever you decide.

Regards,
Dean




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b

2017-09-14 Thread Dean Rasheed
On 13 September 2017 at 14:51, Robert Haas  wrote:
> Coincidentally, I wrote a patch for this too, but mine goes back to
> rejecting MINVALUE or MAXVALUE followed by anything else.
>

LGTM, if we decide to go this way.

One minor review comment -- you missed an example code snippet using
(MINVALUE, 0) further down the page in create_table.sgml, which needs
updating.

Regards,
Dean




Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b

2017-09-14 Thread Dean Rasheed
On 13 September 2017 at 10:05, Amit Langote
 wrote:
> Coincidentally, I just wrote the patch for canonicalizing stored values,
> instead of erroring out.  Please see attached if that's what you were
> thinking too.
>

Looks reasonable to me, if we decide to go this way.

One minor review comment -- it isn't really necessary to have the
separate new boolean local variables as well as the datum kind
variables. Just tracking the datum kind is sufficient and slightly
simpler. That would also address a slight worry I have that your
coding might result in a compiler warning about the kind variables
being used uninitialised with some less intelligent compilers, not
smart enough to work out that it can't actually happen.

Regards,
Dean




Re: [HACKERS] [PATCH] Improve geometric types

2017-09-14 Thread Kyotaro HORIGUCHI
At Tue, 12 Sep 2017 19:30:44 +0200, Emre Hasegeli  wrote in 

[HACKERS] Re: [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-09-14 Thread Tsunakawa, Takayuki
Hello, Robert, Noah

From: Robert Haas [mailto:robertmh...@gmail.com]
> On Fri, Jul 28, 2017 at 1:30 AM, Noah Misch  wrote:
> > We've reached that period.  If anyone is going to push for a change
> > here, now is the time.  Absent such arguments, the behavior won't change.
> 
> Well, I started out believing that the current behavior was for the best,
> and then completely reversed my position and favored the OP's proposal.
> Nothing has really happened since then to change my mind, so I guess I'm
> still in that camp.  But do we have any new data points?  Have any
> beta-testers tested this and what do they think?
> The only non-developer (i.e. person not living in an ivory tower) who has
> weighed in here is Tels, who favored reversing the original decision and
> adopting Tsunakawa-san's position, and that was 2 months ago.
> 
> I am pretty reluctant to tinker with this at this late date and in the face
> of several opposing votes, but I do think that we bet on the wrong horse.

Sorry again, but how can we handle this?  A non-PG-developer, Tels (and 
possibly someone else, IIRC), asked during the beta period that the 
behavior be changed.  Why should we do nothing?

Regards
Takayuki Tsunakawa







Re: [HACKERS] moving some partitioning code to executor

2017-09-14 Thread Amit Langote
Repeating links for better accessibility:

On 2017/09/14 16:13, Amit Langote wrote:
> [1]

https://www.postgresql.org/message-id/CA%2BTgmoafr%3DhUrM%3Dcbx-k%3DBDHOF2OfXaw95HQSNAK4mHBwmSjtw%40mail.gmail.com

> [2]

https://www.postgresql.org/message-id/7fe0007b-7ad1-a593-df11-ad05364ebce4%40lab.ntt.co.jp

Thanks,
Amit





[HACKERS] moving some partitioning code to executor

2017-09-14 Thread Amit Langote
Hi.

It seems to me that some of the code in partition.c is better placed
somewhere under the executor directory.  There was even a suggestion
recently [1] to introduce a execPartition.c to house some code around
tuple-routing.

IMO, catalog/partition.c should present an interface for handling
operations on a *single* partitioned table and avoid pretending to support
any operations on the whole partition tree.  For example, the
PartitionDispatch structure embeds some knowledge about the partition tree
it's part of, which is useful when used for tuple-routing, because of the
way it works now (lock and form ResultRelInfos of *all* leaf partitions
before the first input row is processed).

So, let's move that structure, along with the code that creates and
manipulates the same, out of partition.c/h and to execPartition.c/h.
Attached patch attempts to do that.

While doing the same, I didn't move *all* of get_partition_for_tuple() out
to execPartition.c, instead modified its signature as shown below:

-extern int get_partition_for_tuple(PartitionDispatch *pd,
-TupleTableSlot *slot,
-EState *estate,
-PartitionDispatchData **failed_at,
-TupleTableSlot **failed_slot);
+extern int get_partition_for_tuple(Relation relation, Datum *values,
+bool *isnull);

That way, we keep the core partition bound comparison logic inside
partition.c and move the rest of the stuff to its caller ExecFindPartition(),
which includes navigating the enveloping PartitionDispatch structures.

Thoughts?
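The proposed division of labor — partition.c only compares one tuple's key values against a single table's bounds, while the executor walks the tree level by level — can be sketched with a toy model. The structures and names below are invented for illustration, not the actual backend types:

```python
class PartitionedTable:
    """One level of a partition tree: ordered range bounds -> children.

    bounds[i] is the exclusive upper bound of children[i]; a child may
    be another PartitionedTable (sub-partitioning) or a leaf name.
    """
    def __init__(self, bounds, children):
        self.bounds = bounds
        self.children = children

def get_partition_for_tuple(table, key):
    """partition.c's job in this sketch: pick a child index at one level."""
    for i, upper in enumerate(table.bounds):
        if key < upper:
            return i
    raise LookupError("no partition found for key %r" % (key,))

def exec_find_partition(root, key):
    """The executor's job: recurse through the levels down to a leaf."""
    node = root
    while isinstance(node, PartitionedTable):
        node = node.children[get_partition_for_tuple(node, key)]
    return node

sub = PartitionedTable([50, 100], ["p1_a", "p1_b"])
root = PartitionedTable([100, 200], [sub, "p2"])
print(exec_find_partition(root, 75))   # p1_b
```

Note how the per-level function needs no knowledge of the enveloping tree, which is the point of the proposed signature change.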

PS: 0001 of the attached is the patch from [2] which is here to be applied
on HEAD before applying the main patch (0002) itself

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CA%2BTgmoafr%3DhUrM%3Dcbx-k%3DBDHOF2OfXaw95HQSNAK4mHBwmSjtw%40mail.gmail.com

[2]
https://www.postgresql.org/message-id/7fe0007b-7ad1-a593-df11-ad05364ebce4%40lab.ntt.co.jp
From cb16fb6c89f60a8cf6b8f713ca9c831fe3824b2d Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 8 Sep 2017 17:35:10 +0900
Subject: [PATCH 1/2] Make RelationGetPartitionDispatch expansion order
 depth-first

This makes the expansion order match what the planner does with
partitioned-table inheritance expansion.  Matching the planner's order
helps because it eases matching the executor's per-partition objects
with the planner-created per-partition nodes.
---
 src/backend/catalog/partition.c | 242 
 1 file changed, 99 insertions(+), 143 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 73eff17202..ddb46a80cb 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -147,6 +147,8 @@ static int32 partition_bound_cmp(PartitionKey key,
 static int partition_bound_bsearch(PartitionKey key,
					PartitionBoundInfo boundinfo,
					void *probe, bool probe_is_bound, bool *is_equal);
+static void get_partition_dispatch_recurse(Relation rel, Relation parent,
+					List **pds, List **leaf_part_oids);
 
 /*
  * RelationBuildPartitionDesc
@@ -1192,21 +1194,6 @@ get_partition_qual_relid(Oid relid)
 }
 
 /*
- * Append OIDs of rel's partitions to the list 'partoids' and for each OID,
- * append pointer rel to the list 'parents'.
- */
-#define APPEND_REL_PARTITION_OIDS(rel, partoids, parents) \
-	do\
-	{\
-		int		i;\
-		for (i = 0; i < (rel)->rd_partdesc->nparts; i++)\
-		{\
-			(partoids) = lappend_oid((partoids), (rel)->rd_partdesc->oids[i]);\
-			(parents) = lappend((parents), (rel));\
-		}\
-	} while(0)
-
-/*
  * RelationGetPartitionDispatchInfo
  *		Returns information necessary to route tuples down a partition tree
  *
@@ -1222,151 +1209,120 @@ PartitionDispatch *
 RelationGetPartitionDispatchInfo(Relation rel,
								 int *num_parted, List **leaf_part_oids)
 {
+	List	   *pdlist;
 	PartitionDispatchData **pd;
-	List	   *all_parts = NIL,
-			   *all_parents = NIL,
-			   *parted_rels,
-			   *parted_rel_parents;
-	ListCell   *lc1,
-			   *lc2;
-	int			i,
-				k,
-				offset;
+	ListCell   *lc;
+	int			i;
 
-	/*
-	 * We rely on the relcache to traverse the partition tree to build both
-	 * the leaf partition OIDs list and the array of PartitionDispatch objects
-	 * for the partitioned tables in the tree.  That means every partitioned
-	 * table in the tree must be locked, which is fine since we require the
-	

Re: [HACKERS] SCRAM in the PG 10 release notes

2017-09-14 Thread Michael Paquier
On Thu, Sep 14, 2017 at 3:57 PM, Heikki Linnakangas  wrote:
> I updated the list of drivers on the wiki
> (https://wiki.postgresql.org/wiki/List_of_drivers), adding a column for
> whether the driver supports SCRAM authentication. Currently, the only
> non-libpq driver that has implemented SCRAM is the JDBC driver. I submitted
> a patch for the Go driver, but it hasn't been committed yet.

Thanks Heikki for updating. Sorry for slacking on this item.

Regarding the Erlang drivers, http://glozer.net/src/epgsql/ and
http://tilleuropa.se/pgsql are dead links, and I have found one Erlang
driver on GitHub which is rather active:
https://github.com/epgsql/epgsql
This does not use libpq, and it does not support the SASL protocol.

The third driver
(https://code.google.com/archive/p/erlang-psql-driver/) exists but it
is claimed as unmaintained. Wouldn't it be better to just remove all
three from the list and replace them by the github one?

For node-postgres, I would recommend updating the third column to tell
"yes" if libpq is used, and "no" otherwise.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Setting pd_lower in GIN metapage

2017-09-14 Thread Michael Paquier
On Wed, Sep 13, 2017 at 4:43 PM, Amit Langote
 wrote:
> Sure, no problem.

OK, I took a closer look at all patches, but did not run any manual
tests to test the compression except some stuff with
wal_consistency_checking.

+   if (opaque->flags & GIN_DELETED)
+   mask_page_content(page);
+   else if (pagehdr->pd_lower != 0)
+   mask_unused_space(page);
[...]
+   /* Mask the unused space, provided the page's pd_lower is set. */
+   if (pagehdr->pd_lower != 0)
mask_unused_space(page);
For the masking functions, shouldn't those checks use (pd_lower >
SizeOfPageHeaderData) instead of 0? pd_lower gets set to a non-zero
value on HEAD, so you would apply the masking even if the meta page is
upgraded from an instance that did not enforce the value of pd_lower
later on. Those conditions also definitely need comments. That will be
a good reminder as to why it needs to be kept.

+*
+* This won't be of any help unless we use option like REGBUF_STANDARD
+* while registering the buffer for metapage as otherwise, it won't try to
+* remove the hole while logging the full page image.
 */
This comment is in the btree code. But you actually add
REGBUF_STANDARD. So I think that this could be just removed.

 * Set pd_lower just past the end of the metadata.  This is not essential
-* but it makes the page look compressible to xlog.c.
+* but it makes the page look compressible to xlog.c.  See
+* _bt_initmetapage.
This reference could be removed as well as _bt_initmetapage does not
provide any information, the existing comment is wrong without your
patch, and then becomes right with this patch.

After that I have spotted a couple of places for btree, hash and
SpGist where the updates of pd_lower are not happening. Let's keep in
mind that updated meta pages could come from an upgraded version, so
we need to be careful here about all code paths updating meta pages,
and WAL-logging them.

It seems to me that an update of pd_lower is missing in _bt_getroot(),
just before marking the buffer as dirty I think. And there is a second
in _bt_unlink_halfdead_page(), a third in _bt_insertonpg() and finally
one in _bt_newroot().

For SpGist, I think that there are two missing: spgbuild() and spgGetCache().

For hash, hashbulkdelete(), _hash_vacuum_one_page(),
_hash_addovflpage(), _hash_freeovflpage() and _hash_doinsert() are
missing the shot, no? We could have a meta page of a hash index
upgraded from PG10.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SCRAM in the PG 10 release notes

2017-09-14 Thread Heikki Linnakangas

On 09/12/2017 04:09 AM, Noah Misch wrote:

On Wed, May 10, 2017 at 10:50:51PM -0400, Bruce Momjian wrote:

On Mon, May  1, 2017 at 08:12:51AM -0400, Robert Haas wrote:

On Tue, Apr 25, 2017 at 10:16 PM, Bruce Momjian  wrote:

Well, we could add "MD5 users are encouraged to switch to
SCRAM-SHA-256".  Now whether we want to list this as something on the
SCRAM-SHA-256 description, or mention it as an incompatibility, or
under Migration.  I am not clear that MD5 is in such terrible shape that
this is warranted.


I think it's warranted.  The continuing use of MD5 has been a headache
for some EnterpriseDB customers who have compliance requirements which
they must meet.  It's not that they themselves necessarily know or
care whether MD5 is secure, although in some cases they do; it's that
if they use it, they will be breaking laws or regulations to which
their business or agency is subject.  I imagine customers of other
PostgreSQL companies have similar issues.  But leaving that aside, the
advantage of SCRAM isn't merely that it uses a better algorithm to
hash the password.  It has other advantages also, like not being
vulnerable to replay attacks.  If you're doing password
authentication, you should really be using SCRAM, and encouraging
people to move to SCRAM after upgrading is a good idea.

That having been said, SCRAM is a wire protocol break.  You will not
be able to upgrade to SCRAM unless and until the drivers you use to
connect to the database add support for it.  The only such driver
that's part of core is libpq; other drivers that have reimplemented the
PostgreSQL wire protocol will have to be updated with SCRAM support
before it will be possible to use SCRAM with those drivers.  I think
this should be mentioned in the release notes, too.  I also think it
would be great if somebody would put together a wiki page listing all
the popular drivers and (1) whether they use libpq or reimplement the
wire protocol, and (2) if the latter, the status of any efforts to
implement SCRAM, and (3) if those efforts have been completed, the
version from which they support SCRAM.  Then, I think we should reach
out to all of the maintainers of those drivers who aren't
moving to support SCRAM and encourage them to do so.


I have added this as an open item because we will have to wait to see
where we are with driver support as the release gets closer.


With the release near, I'm promoting this to the regular open issues section.


Thanks.

I updated the list of drivers on the wiki 
(https://wiki.postgresql.org/wiki/List_of_drivers), adding a column for 
whether the driver supports SCRAM authentication. Currently, the only 
non-libpq driver that has implemented SCRAM is the JDBC driver. I 
submitted a patch for the Go driver, but it hasn't been committed yet.


As for a recommendation in the release notes, maybe something like 
"Installations using MD5 authentication are encouraged to switch to 
SCRAM-SHA-256, unless using older client programs or drivers that don't 
support it yet."


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Binary search in fmgr_isbuiltin() is a bottleneck.

2017-09-14 Thread Andres Freund
Hi,

Surprising myself I discovered that in workloads that do a large number
of fmgr_info* lookups, fmgr_isbuiltin() is actually quite the
bottleneck.

In my development build we have 2765 builtin functions, stored in an 88KB
array. Apparently the ~12 probe steps are quite noticeable. Generally binary
searches have quite a poor memory access pattern...

In the workload I'm playing around with right now (producing this wall of
performance fixes) the main source of lookups is
printtup_prepare_info(), which does a fmgr_info for every column. If you
have a large number of columns ...

I think we could conceivably deduplicate the output functions for
printtup to one FmgrInfo per type? I'd assume that it'd be good for
things besides reducing the overhead - allowing the respective function
to reuse fn_extra might be quite good.  I've not implemented that idea
at this point, I'm not 100% sure what the best way to do so is without
also causing slowdowns.

Another idea would be to have an array of FmgrBuiltin*, that we index by
oid. That'd not be super small though, given that the space for function
oids is sparse.

Thus what I've instead done is replacing the binary search in
fmgr_isbuiltin() with a simplehash.h style hashtable. After that the
lookup is still visible in the profile, but far less prominent.

I'd like to move the hash creation out of fmgr_isbuiltin (to avoid
having to check whether it's already created), but there's currently no
convenient place to create the hash from.   Now that we don't rely on
the sortedness of fmgrtab.c we could remove a few lines from
Gen_fmgrtab.pl, but I don't quite see the advantage. If we were
interested in a faster by-name lookup we could sort it by name, but
that'd be better solved by another hashtable...


I was wondering about also replacing the C function hash with a
simplehash, but given that I've not seen this in profiles, I've not
bothered so far.

Greetings,

Andres Freund
>From 2b3e06380d5a339efc94e748aa57985d3bb80223 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 13 Sep 2017 18:43:46 -0700
Subject: [PATCH 4/8] Add inline murmurhash32(int32) function.

The function already existed in tidbitmap.c but more users requiring
fast hashing of 32bit ints are coming up.
---
 src/backend/nodes/tidbitmap.c | 20 ++--
 src/include/utils/hashutils.h | 18 ++
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index c4e53adb0c..01d6bc5c11 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -45,6 +45,7 @@
 #include "nodes/tidbitmap.h"
 #include "storage/lwlock.h"
 #include "utils/dsa.h"
+#include "utils/hashutils.h"
 
 /*
  * The maximum number of tuples per page is not large (typically 256 with
@@ -237,30 +238,13 @@ static int	tbm_comparator(const void *left, const void *right);
 static int tbm_shared_comparator(const void *left, const void *right,
 	  void *arg);
 
-/*
- * Simple inline murmur hash implementation for the exact width required, for
- * performance.
- */
-static inline uint32
-hash_blockno(BlockNumber b)
-{
-	uint32		h = b;
-
-	h ^= h >> 16;
-	h *= 0x85ebca6b;
-	h ^= h >> 13;
-	h *= 0xc2b2ae35;
-	h ^= h >> 16;
-	return h;
-}
-
 /* define hashtable mapping block numbers to PagetableEntry's */
 #define SH_USE_NONDEFAULT_ALLOCATOR
 #define SH_PREFIX pagetable
 #define SH_ELEMENT_TYPE PagetableEntry
 #define SH_KEY_TYPE BlockNumber
 #define SH_KEY blockno
-#define SH_HASH_KEY(tb, key) hash_blockno(key)
+#define SH_HASH_KEY(tb, key) murmurhash32(key)
 #define SH_EQUAL(tb, a, b) a == b
 #define SH_SCOPE static inline
 #define SH_DEFINE
diff --git a/src/include/utils/hashutils.h b/src/include/utils/hashutils.h
index 56b7bfc9cb..35281689e8 100644
--- a/src/include/utils/hashutils.h
+++ b/src/include/utils/hashutils.h
@@ -20,4 +20,22 @@ hash_combine(uint32 a, uint32 b)
 	return a;
 }
 
+
+/*
+ * Simple inline murmur hash implementation hashing a 32 bit integer, for
+ * performance.
+ */
+static inline uint32
+murmurhash32(uint32 data)
+{
+	uint32		h = data;
+
+	h ^= h >> 16;
+	h *= 0x85ebca6b;
+	h ^= h >> 13;
+	h *= 0xc2b2ae35;
+	h ^= h >> 16;
+	return h;
+}
+
 #endif			/* HASHUTILS_H */
-- 
2.14.1.536.g6867272d5b.dirty

>From 703ddd56fb484692c84101d1731e60f9ea81193f Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 13 Sep 2017 19:43:02 -0700
Subject: [PATCH 5/8] Replace binary search in fmgr_isbuiltin with hashtable.

Turns out we have enough functions that the binary search is quite
noticeable in profiles.

It'd be nice if there were a better place to build the hashtable than
the first pass through fmgr_isbuiltin() but there's currently none.
---
 src/backend/utils/fmgr/fmgr.c | 63 ---
 1 file changed, 47 insertions(+), 16 deletions(-)

diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index a7b07827e0..87867f2183 100644

Re: [HACKERS] WAL logging problem in 9.4.3?

2017-09-14 Thread Kyotaro HORIGUCHI
At Wed, 13 Sep 2017 17:42:39 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170913.174239.25978735.horiguchi.kyot...@lab.ntt.co.jp>
> filterdiff seems to did something wrong..

# to did...

The patch was broken by filterdiff, so I'm sending a new patch made
directly with git format-patch. I confirmed that a build completes
with this patch applied.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 7086b5855080065f73de4d099cbaab09511f01fc Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 12 Sep 2017 13:01:33 +0900
Subject: [PATCH] Fix WAL logging problem

---
 src/backend/access/heap/heapam.c| 113 +---
 src/backend/access/heap/pruneheap.c |   3 +-
 src/backend/access/heap/rewriteheap.c   |   4 +-
 src/backend/access/heap/visibilitymap.c |   3 +-
 src/backend/access/transam/xact.c   |   7 +
 src/backend/catalog/storage.c   | 318 +---
 src/backend/commands/copy.c |  13 +-
 src/backend/commands/createas.c |   9 +-
 src/backend/commands/matview.c  |   6 +-
 src/backend/commands/tablecmds.c|   8 +-
 src/backend/commands/vacuumlazy.c   |   6 +-
 src/backend/storage/buffer/bufmgr.c |  40 +++-
 src/backend/utils/cache/relcache.c  |  13 ++
 src/include/access/heapam.h |   8 +-
 src/include/catalog/storage.h   |   5 +-
 src/include/storage/bufmgr.h|   2 +
 src/include/utils/rel.h |   8 +
 17 files changed, 476 insertions(+), 90 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index d20f038..e40254d 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -34,6 +34,28 @@
  *	  the POSTGRES heap access method used for all POSTGRES
  *	  relations.
  *
+ * WAL CONSIDERATIONS
+ *	  All heap operations are normally WAL-logged, but there are a few
+ *	  exceptions. Temporary and unlogged relations never need to be
+ *	  WAL-logged, but we can also skip WAL-logging for a table that was
+ *	  created in the same transaction, if we don't need WAL for PITR or
+ *	  WAL archival purposes (i.e. if wal_level=minimal), and we fsync()
+ *	  the file to disk at COMMIT instead.
+ *
+ *	  The same-relation optimization is not employed automatically on all
+ *	  updates to a table that was created in the same transaction, because
+ *	  for a small number of changes, it's cheaper to just create the WAL
+ *	  records than fsyncing() the whole relation at COMMIT. It is only
+ *	  worthwhile for (presumably) large operations like COPY, CLUSTER,
+ *	  or VACUUM FULL. Use heap_register_sync() to initiate such an
+ *	  operation; it will cause any subsequent updates to the table to skip
+ *	  WAL-logging, if possible, and cause the heap to be synced to disk at
+ *	  COMMIT.
+ *
+ *	  To make that work, all modifications to heap must use
+ *	  HeapNeedsWAL() to check if WAL-logging is needed in this transaction
+ *	  for the given block.
+ *
  *-
  */
 #include "postgres.h"
@@ -56,6 +78,7 @@
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/storage.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -2373,12 +2396,6 @@ ReleaseBulkInsertStatePin(BulkInsertState bistate)
  * The new tuple is stamped with current transaction ID and the specified
  * command ID.
  *
- * If the HEAP_INSERT_SKIP_WAL option is specified, the new tuple is not
- * logged in WAL, even for a non-temp relation.  Safe usage of this behavior
- * requires that we arrange that all new tuples go into new pages not
- * containing any tuples from other transactions, and that the relation gets
- * fsync'd before commit.  (See also heap_sync() comments)
- *
  * The HEAP_INSERT_SKIP_FSM option is passed directly to
  * RelationGetBufferForTuple, which see for more info.
  *
@@ -2409,6 +2426,7 @@ ReleaseBulkInsertStatePin(BulkInsertState bistate)
  * TID where the tuple was stored.  But note that any toasting of fields
  * within the tuple data is NOT reflected into *tup.
  */
+extern HTAB *pendingSyncs;
 Oid
 heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 			int options, BulkInsertState bistate)
@@ -2482,7 +2500,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	MarkBufferDirty(buffer);
 
 	/* XLOG stuff */
-	if (!(options & HEAP_INSERT_SKIP_WAL) && RelationNeedsWAL(relation))
+	if (BufferNeedsWAL(relation, buffer))
 	{
 		xl_heap_insert xlrec;
 		xl_heap_header xlhdr;
@@ -2681,12 +2699,10 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
 	int			ndone;
 	char	   *scratch = NULL;
 	Page		page;
-	bool		needwal;
 	Size		saveFreeSpace;
 	bool		need_tuple_data = RelationIsLogicallyLogged(relation);
 	bool		need_cids = 

[HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-09-14 Thread Andres Freund
Hi,

When running workloads that include fast queries with a lot of columns,
SendRowDescriptionMessage(), and the routines it calls, becomes a
bottleneck.  Besides syscache lookups (see [1] and [2]) a major cost of
that is manipulation of the StringBuffer and the version specific
branches in the per-attribute loop.  As so often, the performance
differential of this patch gets bigger when the other performance
patches are applied.

The issues in SendRowDescriptionMessage() are the following:

1) All the pq_sendint calls, and also the pq_sendstring, are
   expensive. The amount of calculations done for a single 2/4 byte
   addition to the stringbuffer (particularly enlargeStringInfo()) is
   problematic, as are the reallocations themselves.

   I addressed this by adding pq_send*_pre() wrappers, implemented as
   inline functions, that require that memory is pre-allocated.
   Combining that with doing a enlargeStringInfo() in
   SendRowDescriptionMessage() that pre-allocates the maximum required
   memory, yields pretty good speedup.

   I'm not yet super sure about the implementation. For one, I'm not
   sure this shouldn't instead be stringinfo.h functions, with very very
   tiny pqformat.h wrappers. But conversely I think it'd make a lot of
   sense for the pqformat integer functions to get rid of the
   continually maintained trailing null-byte - I was hoping the compiler
   could optimize that away, but alas, no luck.  As soon as a single
   integer is sent, you can't rely on 0 terminated strings anyway.

2) It creates a new StringInfo in every iteration. That causes
   noticeable memory management overhead, and restarts the size of the
   buffer every time. When the description is relatively large, that
   leads to a number of reallocations for every
   SendRowDescriptionMessage() call.

   My solution here was to create persistent StringInfo for
   SendRowDescriptionMessage() that never gets deallocated (just
   reset). That in combination with new versions of
   pq_beginmessage/endmessage that keep the buffer alive, yields a nice
   speedup.

   Currently I'm using a static variable to allocate a string buffer for
   the function. It'd probably be better to manage that outside of a single
   function - I'm also wondering why we're allocating a good number of
   stringinfos in various places, rather than reuse them. Most of them
   can't be entered recursively, and even if that's a concern, we could
   have one reusable per portal or such.

3) The v2/v3 branches in the attribute loop are noticeable (others too,
   but well...).  I solved that by splitting out the v2 and v3
   per-attribute loops into separate functions.  Imo also a good chunk
   more readable.

Comments?

Greetings,

Andres Freund

[1] 
http://archives.postgresql.org/message-id/ca+tgmobj72e_tg6w98h0oubccumoc4urmjocypbnwc2rxya...@mail.gmail.com
[2] 
http://archives.postgresql.org/message-id/20170914061207.zxotvyopetm7lrrp%40alap3.anarazel.de
>From 672cbfd0660e4d2b2cc6980f3f5c2af27e692404 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 13 Sep 2017 18:39:24 -0700
Subject: [PATCH 2/8] Add more efficient functions to pqformat API.

New inline functions allow adding data to a stringbuf more efficiently
by pre-allocating memory ahead of time, and
pq_beginmessage_pre/pq_endmessage_keep allow reuse of a stringbuffer.
---
 src/backend/libpq/pqformat.c   | 37 +
 src/backend/utils/mb/mbutils.c | 11 --
 src/include/libpq/pqformat.h   | 47 ++
 src/include/mb/pg_wchar.h  | 11 ++
 4 files changed, 95 insertions(+), 11 deletions(-)

diff --git a/src/backend/libpq/pqformat.c b/src/backend/libpq/pqformat.c
index c8cf67c041..6e40ee087c 100644
--- a/src/backend/libpq/pqformat.c
+++ b/src/backend/libpq/pqformat.c
@@ -97,6 +97,28 @@ pq_beginmessage(StringInfo buf, char msgtype)
 	buf->cursor = msgtype;
 }
 
+/* 
+
+ *		pq_beginmessage_pre - initialize for sending a message, reuse buffer
+ *
+ * This requires the buffer to be allocated in a sufficiently long-lived
+ * memory context.
+ * 
+ */
+void
+pq_beginmessage_pre(StringInfo buf, char msgtype)
+{
+	resetStringInfo(buf);
+
+	/*
+	 * We stash the message type into the buffer's cursor field, expecting
+	 * that the pq_sendXXX routines won't touch it.  We could alternatively
+	 * make it the first byte of the buffer contents, but this seems easier.
+	 */
+	buf->cursor = msgtype;
+}
+
+
 /* 
  *		pq_sendbyte		- append a raw byte to a StringInfo buffer
  * 
@@ -350,6 +372,21 @@ pq_endmessage(StringInfo buf)
 	buf->data = NULL;
 }
 
+/* 
+ *		pq_endmessage_keep	- send the completed message to the frontend
+ *
+ * The data buffer is *not* freed, allowing the buffer to be reused with
+ * pq_beginmessage_pre.
+ 

[HACKERS] Improve catcache/syscache performance.

2017-09-14 Thread Andres Freund
Hi,

There's plenty workloads where SearchSysCache()/SearchCatCache() shows
up as a major runtime factor. Primarily in workloads with very fast
queries.

A fair while ago, before I had my commit bit, I'd posted [1]. Looking at
the profiles/benchmarks I was easily able to confirm that it still
helps, but that there's also still a lot left on the table.

Attached is a patch that tries to improve sys/catcache performance,
going further than the patch referenced earlier.

This primarily includes four pieces:

1) Avoidance of FunctionCallInfo based function calls, replaced by
   more efficient functions with a native C argument interface.
2) Only initializing the ScanKey when necessary, i.e. on catcache misses,
   reduces unnecessary CPU cache misses.
3) Allowing the compiler to specialize critical SearchCatCache for a
   specific number of attributes allows unrolling loops and avoiding
   other nkeys-dependent initialization.
4) Split of the heap lookup from the hash lookup, reducing stack
   allocations etc in the common case.

There's further potential:
- replace open coded hash with simplehash - the list walk right now
  shows up in profiles.
- As oid is the only system column supported, avoid the use of
  heap_getsysattr(), by adding an explicit branch for
  ObjectIdAttributeNumber. This shows up in profiles.
- move cache initialization out of the search path
- add more proper functions, rather than macros for
  SearchSysCacheCopyN etc., but right now they don't show up in profiles.

The reason the macro wrapper for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside.  That might be a good idea
anyway, but it's for another day.

This patch gives me roughly 8% speedup in a workload that consists
of a fast query that returns a lot of columns.  If I apply a few
other performance patches, this patch itself starts to make a bigger
difference, of around 11%.

Greetings,

Andres Freund

[1] 
https://www.postgresql.org/message-id/20130905191323.gc490...@alap2.anarazel.de

>From 0feaa0c4a39b3e0e995cd5897cfd3ebba6a92c48 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 13 Sep 2017 19:58:43 -0700
Subject: [PATCH 6/8] Add pg_noinline macro to c.h.

Forcing a function not to be inlined can be useful if it's the
slow-path of a performance critical function, or should be visible in
profiles to allow for proper cost attribution.

Author: Andres Freund
Discussion: https://postgr.es/m/
---
 src/include/c.h | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/include/c.h 

Re: [HACKERS] More efficient truncation of pg_stat_activity query strings

2017-09-14 Thread Andres Freund
On 2017-09-12 00:19:48 -0700, Andres Freund wrote:
> Hi,
> 
> I've recently seen a benchmark in which pg_mbcliplen() showed up
> prominently. Which it will basically in any benchmark with longer query
> strings, but fast queries. That's not that uncommon.
> 
> I wonder if we could avoid the cost of pg_mbcliplen() from within
> pgstat_report_activity(), by moving some of the cost to the read
> side. pgstat values are obviously read far less frequently in nearly all
> cases that are performance relevant.
> 
> Therefore I wonder if we couldn't just store a querystring that's
> essentially just a memcpy()ed prefix, and do a pg_mbcliplen() on the
> read side.  I think that should work because all *server side* encodings
> store character lengths in the *first* byte of a multibyte character
> (at least one clientside encoding, gb18030, doesn't behave that way).
> 
> That'd necessitate an added memory copy in pg_stat_get_activity(), but
> that seems fairly harmless.
> 
> Faults in my thinking?

Here's a patch that implements that idea.  Seems to work well.  I'm a
bit loath to add proper regression tests for this, seems awfully
dependent on specific track_activity_query_size settings.  I did confirm
using gdb that I see incomplete characters before
pgstat_clip_activity(), but not after.

I've renamed st_activity to st_activity_raw to increase the likelihood
that potential external users of st_activity notice and adapt. Increases
the noise, but imo to a very bearable amount. Don't feel strongly
though.

Greetings,

Andres Freund
>From 9c9503f0dfe1babb21e81c1955e996ad06c93608 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 13 Sep 2017 19:25:34 -0700
Subject: [PATCH 1/8] Speedup pgstat_report_activity by moving mb-aware
 truncation to read side.

Previously multi-byte aware truncation was done on every
pgstat_report_activity() call - proving to be a bottleneck for
workloads with long query strings that execute quickly.

Instead move the truncation to the read side, which is commonly
executed far less frequently. That's possible because all server
encodings allow determining the length of a multi-byte character from
its first byte.

Author: Andres Freund
Discussion: https://postgr.es/m/20170912071948.pa7igbpkkkvie...@alap3.anarazel.de
---
 src/backend/postmaster/pgstat.c | 63 -
 src/backend/utils/adt/pgstatfuncs.c | 17 +++---
 src/include/pgstat.h| 12 +--
 3 files changed, 72 insertions(+), 20 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index accf302cf7..ccb528e627 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2701,7 +2701,7 @@ CreateSharedBackendStatus(void)
 		buffer = BackendActivityBuffer;
 		for (i = 0; i < NumBackendStatSlots; i++)
 		{
-			BackendStatusArray[i].st_activity = buffer;
+			BackendStatusArray[i].st_activity_raw = buffer;
 			buffer += pgstat_track_activity_query_size;
 		}
 	}
@@ -2922,11 +2922,11 @@ pgstat_bestart(void)
 #endif
 	beentry->st_state = STATE_UNDEFINED;
 	beentry->st_appname[0] = '\0';
-	beentry->st_activity[0] = '\0';
+	beentry->st_activity_raw[0] = '\0';
 	/* Also make sure the last byte in each string area is always 0 */
 	beentry->st_clienthostname[NAMEDATALEN - 1] = '\0';
 	beentry->st_appname[NAMEDATALEN - 1] = '\0';
-	beentry->st_activity[pgstat_track_activity_query_size - 1] = '\0';
+	beentry->st_activity_raw[pgstat_track_activity_query_size - 1] = '\0';
 	beentry->st_progress_command = PROGRESS_COMMAND_INVALID;
 	beentry->st_progress_command_target = InvalidOid;
 
@@ -3017,7 +3017,7 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 			pgstat_increment_changecount_before(beentry);
 			beentry->st_state = STATE_DISABLED;
 			beentry->st_state_start_timestamp = 0;
-			beentry->st_activity[0] = '\0';
+			beentry->st_activity_raw[0] = '\0';
 			beentry->st_activity_start_timestamp = 0;
 			/* st_xact_start_timestamp and wait_event_info are also disabled */
 			beentry->st_xact_start_timestamp = 0;
@@ -3034,8 +3034,12 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	start_timestamp = GetCurrentStatementStartTimestamp();
 	if (cmd_str != NULL)
 	{
-		len = pg_mbcliplen(cmd_str, strlen(cmd_str),
-		   pgstat_track_activity_query_size - 1);
+		/*
+		 * Compute length of to-be-stored string unaware of multi-byte
+		 * characters. For speed reasons that'll get corrected on read, rather
+		 * than computed on every write.
+		 */
+		len = Min(strlen(cmd_str), pgstat_track_activity_query_size - 1);
 	}
 	current_timestamp = GetCurrentTimestamp();
 
@@ -3049,8 +3053,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 
 	if (cmd_str != NULL)
 	{
-		memcpy((char *) beentry->st_activity, cmd_str, len);
-		beentry->st_activity[len] = '\0';
+		memcpy((char *) beentry->st_activity_raw, cmd_str, len);
+		beentry->st_activity_raw[len] = '\0';
 		

<    1   2