[HACKERS] GSoC 2017: weekly progress reports (week 9 and week 10)

2017-08-09 Thread Shubham Barai
Project: Explicitly support predicate locks in index AMs besides b-tree

Hi,

In the last two weeks, I mostly worked on predicate locking in the rum index.
Rum is based on the gin access method. The main difference between rum and
gin is that rum stores additional information in its posting tree to perform
fast full-text search using tsvector and tsquery.

Also, rum has different scanning strategies that include
scangetItemRegular (gets the next heap pointer from scan),
scangetItemfast (gets the next item pointer using fast scan),
scangetItemfull (gets the next item pointer using full index scan).

We have to insert a call to PredicateLockPage at all appropriate places
where a leaf page of the entry tree or posting tree is scanned.
Unlike gin, rum doesn't support fast update, so we don't have to worry
about acquiring a relation-level lock.

In summary, I have done the following things in the last two weeks.

1) read the source code of rum to understand its access method

2) found appropriate places to insert calls to existing functions

3) modified some function prototypes to include a snapshot parameter
 (rum was forked before the snapshot feature was added to PostgreSQL, and
  the snapshot parameter is needed for predicate locking)

4) created tests (to verify serialization failures and to demonstrate the
feature of reduced false positives)

5) debugged and fixed one issue in the previous patch for gin index.

6) created tests for predicate locking in B-tree.

link to the code for rum:
https://github.com/shubhambaraiss/rum/commit/e28e31d79fa6a0d1b6ef4bb7d7e5d0d334a069d9


Regards,
Shubham






Re: [HACKERS] dubious error message from partition.c

2017-08-09 Thread Dean Rasheed
On 9 August 2017 at 13:03, Robert Haas  wrote:
> On Tue, Aug 8, 2017 at 11:34 PM, Tom Lane  wrote:
>> A small suggestion is that it'd be better to write it like "Specified
>> upper bound \"%s\" precedes lower bound \"%s\"."  I think "succeeds" has
>> more alternate meanings than "precedes", so the wording you have seems
>> more confusing than it needs to be.  (Of course, the situation could be
>> the opposite in other languages, but translators have the ability to
>> reverse the ordering if they need to.)
>
> I think that doesn't quite work, because the failure is caused by UB
> <= LB, not UB < LB.  We could fix that by writing "precedes or equals"
> but that seems lame.  Maybe:
>
> Lower bound %s does not precede upper bound %s.
>

There was an earlier suggestion to use "greater than or equal to". I
think that would work quite well:

ERROR:  invalid range bounds for partition \"%s\"
DETAIL:  Lower bound %s is greater than or equal to upper bound %s.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Failover Slots

2017-08-09 Thread Robert Haas
On Tue, Aug 8, 2017 at 4:00 AM, Craig Ringer  wrote:
>> - When a standby connects to a master, it can optionally supply a list
>> of slot names that it cares about.
>
> Wouldn't that immediately exclude use for PITR and snapshot recovery? I have
> people right now who want the ability to promote a PITR-recovered snapshot
> into place of a logical replication master and have downstream peers replay
> from it. It's more complex than that, as there's a resync process required
> to recover changes the failed node had sent to other peers but isn't
> available in the WAL archive, but that's the gist.
>
> If you have a 5TB database do you want to run an extra replica or two
> because PostgreSQL can't preserve slots without a running, live replica?
> Your SAN snapshots + WAL archiving have been fine for everything else so
> far.

OK, so what you're basically saying here is that you want to encode
the failover information in the write-ahead log rather than passing it
at the protocol level, so that if you replay the write-ahead log on a
time delay you get the same final state that you would have gotten if
you had replayed it immediately.  I hadn't thought about that
potential advantage, and I can see that it might be an advantage for
some reason, but I don't yet understand what the reason is.  How would
you imagine using any version of this feature in a PITR scenario?  If
you PITR the master back to an earlier point in time, I don't see how
you're going to manage without resyncing the replicas, at which point
you may as well just drop the old slot and create a new one anyway.
Maybe you're thinking of a scenario where we PITR the master and also
use PITR to rewind the replica to a slightly earlier point?  But I
can't quite follow what you're thinking about.  Can you explain
further?

> Requiring live replication connections could also be an issue for service
> interruptions, surely?  Unless you persist needed knowledge in the physical
> replication slot used by the standby to master connection, so the master can
> tell the difference between "downstream went away for while but will come
> back" and "downstream is gone forever, toss out its resources."

I don't think the master needs to retain any resources on behalf of
the failover slot.  If the slot has been updated by feedback from the
associated standby, then the master can toss those resources
immediately.  When the standby comes back on line, it will find out
via a protocol message that it can fast-forward the slot to whatever
the new LSN is, and any WAL files before that point are irrelevant on
both the master and the standby.

> Also, what about cascading? Lots of "pull" model designs I've looked at tend
> to fall down in cascaded environments. For that matter so do failover slots,
> but only for the narrower restriction of not being able to actually decode
> from a failover-enabled slot on a standby, they still work fine in terms of
> cascading down to leaf nodes.

I don't see the problem.  The cascaded standby tells the standby "I'm
interested in the slot called 'craig'" and the standby says "sure,
I'll tell you whenever 'craig' gets updated" but it turns out that
'craig' is actually a failover slot on that standby, so that standby
has said to the master "I'm interested in the slot called 'craig'" and
the master is therefore sending updates to that standby.  Every time
the slot is updated, the master tells the standby and the standby
tells the cascaded standby and, well, that all seems fine.

Also, as Andres pointed out upthread, if the state is passed through
the protocol, you can have a slot on a standby that cascades to a
cascaded standby; if the state is passed through the WAL, all slots
have to cascade from the master.  Generally, with protocol-mediated
failover slots, you can have a different set of slots on every replica
in the cluster and create, drop, and reconfigure them any time you
like.  With WAL-mediated slots, all failover slots must come from the
master and cascade to every standby you've got, which is less
flexible.

I don't want to come on too strong here.  I'm very willing to admit
that you may know a lot more about this than me and I am really
extremely happy to benefit from that accumulated knowledge.  If you're
saying that WAL-mediated slots are a lot better than protocol-mediated
slots, you may well be right, but I don't yet understand the reasons,
and I want to understand the reasons.  I think this stuff is too
important to just have one person saying "here's a patch that does it
this way" and everybody else just says "uh, ok".  Once we adopt some
proposal here we're going to have to continue supporting it forever,
so it seems like we'd better do our best to get it right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread Peter Eisentraut
On 8/9/17 08:38, MauMau wrote:
> I tried to find a particular collation name in pg_collation, but I
> cannot understand the naming convention after reading the following
> article.  Specifically, I want to find out whether there is some
> collation equivalent to Japanese_CI_AS in SQL Server, which means
> Japanese, case-insensitive, and accent sensitive.

There are no case-insensitive collations in PostgreSQL (yet).

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-08-09 Thread Thomas Munro
On Thu, Aug 10, 2017 at 1:39 AM, Thomas Munro wrote:
> On my computer it took ~1.5 seconds to plan a 1000 partition join,
> ~7.1 seconds to plan a 2000 partition join, and ~50 seconds to plan a
> 4000 partition join.  I poked around in a profiler a bit and saw that
> for the 2000 partition case I spent almost half the time in
> create_plan->...->prepare_sort_from_pathkeys->find_ec_member_for_tle,
> and about half of that was in bms_is_subset.  The other half the time
> was in query_planner->make_one_rel which spent 2/3 of its time in
> set_rel_size->add_child_rel_equivalences->bms_overlap and the other
> 1/3 in standard_join_search.

Ashutosh asked me how I did that.  Please see attached.  I was
running EXPLAIN on simple joins like SELECT * FROM foofoo JOIN barbar
USING (a, b).  Here also is the experimental hack I tried when I saw
bitmapset.c eating my CPU.

-- 
Thomas Munro
http://www.enterprisedb.com


bitmapset-track-leading-empty-space.patch
Description: Binary data


make-partitions.sh
Description: Bourne shell script



[HACKERS] Comment in snapbuild.c file

2017-08-09 Thread Masahiko Sawada
Hi all,

In snapbuild.c file, there is a comment as follows.

   * NB: Because of that xmax can be lower than xmin, because we only
   * increase xmax when a catalog modifying transaction commits. While odd
   * looking, it's correct and actually more efficient this way since we hit
   * fast paths in tqual.c.
   */

Maybe we can get rid of the second "because" in the first sentence?

Attached patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 153d171..55026ed 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1116,10 +1116,10 @@ SnapBuildProcessRunningXacts(SnapBuild *builder, XLogRecPtr lsn, xl_running_xact
 	 * so, because we only need to do it for catalog transactions since we
 	 * only ever look at those.
 	 *
-	 * NB: Because of that xmax can be lower than xmin, because we only
-	 * increase xmax when a catalog modifying transaction commits. While odd
-	 * looking, it's correct and actually more efficient this way since we hit
-	 * fast paths in tqual.c.
+	 * NB: Because of that xmax can be lower than xmin, we only increase xmax
+	 * when a catalog modifying transaction commits. While odd looking, it's
+	 * correct and actually more efficient this way since we hit fast paths
+	 * in tqual.c.
 	 */
 	builder->xmin = running->oldestRunningXid;
 



Re: [HACKERS] free space % calculation in pgstathashindex

2017-08-09 Thread Noah Misch
On Tue, Aug 08, 2017 at 02:30:51PM +0530, Amit Kapila wrote:
> On Mon, Aug 7, 2017 at 9:38 PM, Ashutosh Sharma  wrote:
> > On Mon, Aug 7, 2017 at 7:19 PM, Amit Kapila  wrote:
> >> On Mon, Aug 7, 2017 at 6:07 PM, Ashutosh Sharma  
> >> wrote:
> >>> Hi,
> >>>
> >> ..
> ..
> >> Why an extra parenthesis in the above case but not in the below case?  I
> >> think the code will look consistent if you follow the same coding
> >> practice.  I suggest not using it unless you need it.
> >
> > That is because in the 1st case, there are multiple operators (*, +)
> > whereas in the 2nd case we have just one (*). So, just to ensure that
> > '*' is performed before '+', I had used parentheses, though they are not
> > required as '*' has higher precedence than '+'. I have removed the
> > extra parentheses, and attached is the new version of the patch. Thanks.
> >
> 
> Your latest patch looks good to me.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Walsender timeouts and large transactions

2017-08-09 Thread Craig Ringer
On 9 August 2017 at 21:23, Petr Jelinek wrote:

> On 02/08/17 19:35, Yura Sokolov wrote:
> > The following review has been posted through the commitfest application:
> > make installcheck-world:  tested, passed
> > Implements feature:   not tested
> > Spec compliant:   not tested
> > Documentation:not tested
> >
> > There is no check for (last_reply_timestamp <= 0 || wal_sender_timeout
> <= 0) as in other places
> > (in WalSndKeepaliveIfNecessary for example).
> >
> > I don't think moving the update of 'now' down to the end of the loop
> > body is correct: there are calls to ProcessConfigFile with
> > SyncRepInitConfig, ProcessRepliesIfAny that can last a non-negligible
> > time. It could lead to oversleeping due to a larger computed sleeptime.
> > Though I could be mistaken.
> >
> > I'm not sure about moving `if (!pq_is_send_pending())` in the loop body
> > after WalSndKeepaliveIfNecessary.
> > Is it necessary? But it looks harmless at least.
> >
>
> We also need to do actual timeout handling so that the timeout is not
> deferred to the end of the transaction (which is why I moved `if
> (!pq_is_send_pending())` under the WalSndCheckTimeOut() and
> WalSndKeepaliveIfNecessary() calls).
>
> > Could the patch be reduced to a check after the first
> > `if (!pq_is_send_pending())`? like:
> >
> >   if (!pq_is_send_pending())
> > - return;
> > + {
> > + if (last_reply_timestamp <= 0 || wal_sender_timeout <= 0)
> > + {
> > + CHECK_FOR_INTERRUPTS();
> > + return;
> > + }
> > + if (now <= TimestampTzPlusMilliseconds(last_reply_timestamp,
> wal_sender_timeout / 2))
> > + return;
> > + }
> >
> > If not, what problem prevents?
>
> We should do CHECK_FOR_INTERRUPTS() independently of pq_is_send_pending
> so that it's possible to stop walsender while it's processing large
> transaction.
>
>
Is there any chance of getting this bugfix into Pg 10?

We've just cut back branches, so it'd be a sensible time.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel Append implementation

2017-08-09 Thread Amit Khandekar
On 9 August 2017 at 19:05, Robert Haas  wrote:
> On Wed, Jul 5, 2017 at 7:53 AM, Amit Khandekar  wrote:
>>> This is not applicable on the latest head, i.e. commit
>>> 08aed6604de2e6a9f4d499818d7c641cbf5eb9f7; looks like it needs a rebase.
>>
>> Thanks for notifying. Attached is the rebased version of the patch.
>
> This again needs a rebase.

Attached is the rebased version of the patch. Thanks.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


ParallelAppend_v13_rebased_2.patch
Description: Binary data



Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-08-09 Thread Michael Paquier
On Fri, Aug 4, 2017 at 8:19 AM, Masahiko Sawada  wrote:
> On Fri, Jul 28, 2017 at 2:24 PM, Noah Misch  wrote:
>> This item appears under "decisions to recheck mid-beta".  If anyone is going
>> to push for a change here, now is the time.
>
> It has been 1 week since the previous mail. I thought that there were
> others who argued for changing the behavior of the old-style setting so
> that a quorum commit is chosen. If nobody is going to push for a change,
> can we live with the current behavior?

FWIW, I still see no harm in keeping backward compatibility here, so I
am in favor of the status quo.
-- 
Michael




Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Michael Paquier
On Wed, Aug 9, 2017 at 6:38 PM, Robert Haas  wrote:
> The patch doesn't really conform to our coding standards, though, so
> you need to clean it up (or, if you're not sure what you need to do,
> you need to have someone who knows how PostgreSQL code needs to look
> review it for you).

The documentation has a couple of rules for coding conventions:
https://www.postgresql.org/docs/9.6/static/source.html
-- 
Michael




[HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-09 Thread Noah Misch
On Mon, Aug 07, 2017 at 06:23:56PM -0400, Tom Lane wrote:
> Peter Eisentraut  writes:
> > On 8/6/17 20:07, Peter Geoghegan wrote:
> >> I've looked into this. I'll give an example of what keyword variants
> >> there are for Greek, and then discuss what I think each is.
> 
> > I'm not sure why we want to get into editorializing this.  We query ICU
> > for the names of distinct collations and use that.  It's more than most
> > people need, sure, but it doesn't cost us anything.
> 
> Yes, *it does*.  The cost will be borne by users who get screwed at update
> time, not by developers, but that doesn't make it insignificant.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Peter,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Small code improvement for btree

2017-08-09 Thread Masahiko Sawada
On Wed, Aug 9, 2017 at 11:23 AM, Masahiko Sawada  wrote:
> On Mon, Aug 7, 2017 at 1:44 PM, Masahiko Sawada  wrote:
>> On Sat, Aug 5, 2017 at 3:29 AM, Peter Geoghegan  wrote:
>>> On Fri, Aug 4, 2017 at 11:12 AM, Alvaro Herrera wrote:
 Interesting.  We learned elsewhere that it's better to integrate the
 "!= 0" test as part of the macro definition; so a
 better formulation of this patch would be to change the
 P_INCOMPLETE_SPLIT macro and omit the comparison in the Assert.  (See
 commit 594e61a1de03 for an example).
>>
>> Thank you for the information. The macros other than
>> P_INCOMPLETE_SPLIT in btree.h, such as P_ISLEAF and P_ISROOT, also don't
>> return booleans. Should we deal with them as well?
>>


> -   LockBuffer(hbuffer, BUFFER_LOCK_SHARE);
> +   LockBuffer(hbuffer, BT_READ);
>>>
>>> +1.
>>>
>>> One Linus Torvalds rant that I actually agreed with was a rant against
>>> the use of bool as a type in C code. It's fine, as long as you never
>>> forget that it's actually just another integer.
>>>
 I think BT_READ and BT_WRITE are useless, and I'd rather get rid of
 them ...
>>>
>>> Fair enough, but we should either use them consistently or not at all.
>>> I'm not especially concerned about which, as long as it's one of those
>>> two.
>>>
>>
>> I definitely agreed.
>>
>
> Attached updated patch. I'll add it to next CF.
>

Added to the next CF. Feedback and comment are very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-08-09 Thread Vaishnavi Prabakaran
Andres Freund wrote:
> If you were to send a gigabyte of queries, it'd buffer them all up in
> memory... So some more intelligence is going to be needed.

Firstly, sorry for the delayed response, as I got busy with other
activities.

To buffer up the queries before flushing them to the socket, I think
"conn->outCount >= 65536" is OK to use, as 64k is considered to be safe
on Windows, per the comments in pqSendSome().

Attached is the code patch to replace pqFlush calls with pqBatchFlush in
the asynchronous libpq function call flow.
pqFlush is still used in the "PQbatchSyncQueue" and
"PQexitBatchMode" functions.

I am failing to see the benefit of allowing the user to set a
PQBatchAutoFlush(true|false) property. Is it really needed?

Thanks & Regards,
Vaishnavi,
Fujitsu Australia.


0001-Pipelining-batch-support-for-libpq-code-v12.patch
Description: Binary data



Re: [HACKERS] Timing-sensitive case in src/test/recovery TAP tests

2017-08-09 Thread Michael Paquier
On Tue, Aug 8, 2017 at 11:33 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> I got the same thought, wondering as well if get_slot_xmins should be
>> renamed check_slot_xmins with the is() tests moved inside it as well.
>> Not sure if that's worth the API ugliness though.
>
> Mmm, doesn't seem like that's worth doing, but I'm half tempted to merge
> wait_slot_xmins into get_slot_xmins so you can't skip it ...

Let's do that please. Merging both was my first feeling when
refactoring this test upthread. Should I send a patch?
-- 
Michael




Re: [HACKERS] Default Partition for Range

2017-08-09 Thread Beena Emerson
Hello Rajkumar,

On Wed, Aug 9, 2017 at 12:37 PM, Rajkumar Raghuwanshi wrote:
>
> Hi Beena,
>
> I have applied Jeevan's v24 patches and then your v9 patch over commit
> 5ff3d73813ebcc3ff80be77c30b458d728951036,
> and while testing I got a server crash. Below is the SQL to reproduce it.
>
> postgres=# CREATE TABLE rp (a int, b int) PARTITION by range (a);
> CREATE TABLE
> postgres=# CREATE TABLE rp_p1 PARTITION OF rp DEFAULT partition by range(a);
> CREATE TABLE
> postgres=# CREATE TABLE rp_p11 PARTITION OF rp_p1 FOR VALUES FROM (1) TO
> (15);
> CREATE TABLE
> postgres=# CREATE TABLE rp_p12 PARTITION OF rp_p1 DEFAULT;
> CREATE TABLE
> postgres=# insert into rp select i,i from generate_series(1,15) i;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>

Thank you for testing. It seems I made a mistake in the assert
condition. I have corrected it in this patch.



Thank you,

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


default_range_partition_v10.patch
Description: Binary data



Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Michael Paquier
On Tue, Aug 8, 2017 at 11:35 PM, Alvaro Herrera wrote:
> Thread moved to -hackers.
>
> Thomas Munro wrote:
>> On Wed, Aug 9, 2017 at 7:39 AM, Alvaro Herrera wrote:
>
>> > While at it, fix numerous other problems in the vicinity:
>
>> All of the above seem like good candidates for a checker script in
>> src/tools/check_XXX.pl, a bit like the others I've talked about
>> recently [1][2].
>
> Yeah, that's one idea, but I think it'd be better to have all these
> things be generated content from a single "wait_events.txt" file, like
> src/backend/utils/errcodes.txt which gives birth to a whole host of
> different files.

I would prefer that. Check scripts tend never to be run. So, is
there somebody willing to work on that? Or should I, as I am
responsible for increasing the number of wait events?
-- 
Michael




Re: [HACKERS] Default Partition for Range

2017-08-09 Thread Rajkumar Raghuwanshi
On Wed, Aug 9, 2017 at 8:26 AM, Beena Emerson wrote:

> I have updated the patch to make it similar to the way default/null is
> handled in list partition, removing the PARTITION_RANGE_DATUM_DEFAULT.
> This is to be applied over v24 patches shared by Jeevan [1] which
> applies on commit id 5ff3d73813ebcc3ff80be77c30b458d728951036.
>
> The RelationBuildPartitionDesc has been modified a lot, especially the
> way all_bounds, ndatums and rbounds are set.
>
> [1] https://www.postgresql.org/message-id/CAOgcT0OVwDu%
> 2BbeChWb5R5s6rfKLCiWcZT5617hqu7T3GdA1hAw%40mail.gmail.com
>
>
Hi Beena,

I have applied Jeevan's v24 patches and then your v9 patch over commit
5ff3d73813ebcc3ff80be77c30b458d728951036,
and while testing I got a server crash. Below is the SQL to reproduce it.

postgres=# CREATE TABLE rp (a int, b int) PARTITION by range (a);
CREATE TABLE
postgres=# CREATE TABLE rp_p1 PARTITION OF rp DEFAULT partition by range(a);
CREATE TABLE
postgres=# CREATE TABLE rp_p11 PARTITION OF rp_p1 FOR VALUES FROM (1) TO
(15);
CREATE TABLE
postgres=# CREATE TABLE rp_p12 PARTITION OF rp_p1 DEFAULT;
CREATE TABLE
postgres=# insert into rp select i,i from generate_series(1,15) i;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Thomas Munro
On Wed, Aug 9, 2017 at 6:42 PM, Michael Paquier wrote:
> On Tue, Aug 8, 2017 at 11:35 PM, Alvaro Herrera wrote:
>>> All of the above seem like good candidates for a checker script in
>>> src/tools/check_XXX.pl, a bit like the others I've talked about
>>> recently [1][2].
>>
>> Yeah, that's one idea, but I think it'd be better to have all these
>> things be generated content from a single "wait_events.txt" file, like
>> src/backend/utils/errcodes.txt which gives birth to a whole host of
>> different files.
>
> I would prefer that. Check scripts tend never to be run. So, is
> there somebody willing to work on that? Or should I, as I am
> responsible for increasing the number of wait events?

Yeah, that may well be a better idea in this case.  On the other hand
it wouldn't catch multiple users of the same wait event, so both
things could be useful.

As for whether hypothetical check scripts would ever be run, I was
thinking we should stick them under some make target that developers
run all the time anyway -- perhaps "check".  Shouldn't we catch simple
mechanically detectable problems as early in the pipeline as possible?

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Page Scan Mode in Hash Index

2017-08-09 Thread Amit Kapila
On Mon, Aug 7, 2017 at 5:50 PM, Ashutosh Sharma  wrote:
> On Fri, Aug 4, 2017 at 7:14 PM, Amit Kapila  wrote:
>> On Sun, Jul 30, 2017 at 2:07 PM, Ashutosh Sharma  
>> wrote:
>>> Hi,
>>>
>>> On Wed, May 10, 2017 at 2:28 PM, Ashutosh Sharma  
>>> wrote:
 While doing the code coverage testing of v7 patch shared with - [1], I
 found that there are few lines of code in _hash_next() that are
 redundant and needs to be removed. I basically came to know this while
 testing the scenario where a hash index scan starts when a split is in
 progress. I have removed those lines and attached is the newer version
 of patch.

>>>
>>> Please find the new version of patches for page at a time scan in hash
>>> index rebased on top of latest commit in master branch. Also, i have
>>> ran pgindent script with pg_bsd_indent version 2.0 on all the modified
>>> files. Thanks.
>>>
>>
>> Few comments:
>
> Thanks for reviewing the patch.
>

Comments on the latest patch:

1.
/*
+ * We remember prev and next block number along with current block
+ * number so that if fetching the tuples using cursor we know the
+ * page from where to begin. This is for the case where we have
+ * reached the end of bucket chain without finding any matching
+ * tuples.
+ */
+ if (!BlockNumberIsValid(opaque->hasho_nextblkno))
+ prev_blkno = opaque->hasho_prevblkno;

This doesn't seem to be the right place for this comment, as you are
not saving the next or current block number here.  I am not sure whether
you really need this comment at all.  Would it be better if you move
this to the else part of the loop and reword it as:

"Remember next and previous block numbers for scrollable cursors to
know the start position."

2. The code in _hash_readpage looks quite bloated.  I think we can
make it better if we do something like below.
a.
+_hash_readpage(IndexScanDesc scan, Buffer *bufP, ScanDirection dir)
{
..
+ if (so->currPos.buf == so->hashso_bucket_buf ||
+ so->currPos.buf == so->hashso_split_bucket_buf)
+ {
+ so->currPos.prevPage = InvalidBlockNumber;
+ so->currPos.nextPage = opaque->hasho_nextblkno;
+ LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK);
+ }
+ else
+ {
+ so->currPos.prevPage = opaque->hasho_prevblkno;
+ so->currPos.nextPage = opaque->hasho_nextblkno;
+ _hash_relbuf(rel, so->currPos.buf);
+ so->currPos.buf = InvalidBuffer;
+ }
..
}

This code chunk is the same for both forward and backward scans.  I think
you can have a single copy of this code by moving it out of the if-else
block.

b.
+_hash_readpage(IndexScanDesc scan, Buffer *bufP, ScanDirection dir)
{
..
+ /* new page, locate starting position by binary search */
+ offnum = _hash_binsearch(page, so->hashso_sk_hash);
+
+ itemIndex = _hash_load_qualified_items(scan, page, offnum, dir);
+
+ while (itemIndex == 0)
+ {
+ /*
+ * Could not find any matching tuples in the current page, move to
+ * the next page. Before leaving the current page, also deal with
+ * any killed items.
+ */
+ if (so->numKilled > 0)
+ _hash_kill_items(scan);
+
+ /*
+ * We remember prev and next block number along with current block
+ * number so that if fetching the tuples using cursor we know the
+ * page from where to begin. This is for the case where we have
+ * reached the end of bucket chain without finding any matching
+ * tuples.
+ */
+ if (!BlockNumberIsValid(opaque->hasho_nextblkno))
+ prev_blkno = opaque->hasho_prevblkno;
+
+ _hash_readnext(scan, , , );
+ if (BufferIsValid(buf))
+ {
+ so->currPos.buf = buf;
+ so->currPos.currPage = BufferGetBlockNumber(buf);
+ so->currPos.lsn = PageGetLSN(page);
+ offnum = _hash_binsearch(page, so->hashso_sk_hash);
+ itemIndex = _hash_load_qualified_items(scan, page,
+   offnum, dir);
..
}

Have just one copy of the code that searches the offset and loads the
qualified items.  Change the above while loop to a do..while loop, with
a check in between that breaks out of the loop when the item index is
not zero.

3.
- * Find the first item in the index that
- * satisfies the qualification associated with the scan descriptor. On
- * success, the page containing the current index tuple is read locked
- * and pinned, and the scan's opaque data entry is updated to
- * include the buffer.
+ * We find the first item (or, if backward scan, the last item) in
+ * the index that satisfies the qualification associated with the
+ * scan descriptor. On success, the page containing the current
+ * index tuple is read locked and pinned, and data about the
+ * matching tuple(s) on the page has been loaded into so->currPos,
+ * scan->xs_ctup.t_self is set to the heap TID of the current tuple.
+ *
+ * If there are no matching items in the index, we return FALSE,
+ * with no pins or locks held.
  */
 bool
 _hash_first(IndexScanDesc scan, ScanDirection dir)
@@ -229,15 +297,9 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)

I don't think that, on success, the lock or pin is held anymore; after
this patch the only pin will be retained

Re: [HACKERS] parallelize queries containing initplans

2017-08-09 Thread Kuntal Ghosh
On Wed, Aug 9, 2017 at 10:24 AM, Haribabu Kommi
 wrote:
>
> I tested the latest patch and the parallel plan is getting chosen for
> most of the init plans.
>
Thanks for testing.

> For the following query, the parallel plan is not chosen. The query contains
> an init plan that refers to the outer node.
>
> postgres=# explain analyze select * from t1 where t1.i in (select t2.i from
> t2 where t1.k = (select max(k) from t3 where t3.i=t1.i));
>  QUERY PLAN
> -
>  Seq Scan on t1  (cost=0.00..22426.28 rows=448 width=12) (actual
> time=8.335..132.557 rows=2 loops=1)
>Filter: (SubPlan 2)
>Rows Removed by Filter: 894
>SubPlan 2
>  ->  Result  (cost=16.27..31.26 rows=999 width=4) (actual
> time=0.146..0.146 rows=0 loops=896)
>One-Time Filter: (t1.k = $1)
>InitPlan 1 (returns $1)
>  ->  Aggregate  (cost=16.25..16.27 rows=1 width=4) (actual
> time=0.145..0.145 rows=1 loops=896)
>->  Seq Scan on t3  (cost=0.00..16.25 rows=2 width=4)
> (actual time=0.131..0.144 rows=0 loops=896)
>  Filter: (i = t1.i)
>  Rows Removed by Filter: 900
>->  Seq Scan on t2  (cost=16.27..31.26 rows=999 width=4) (actual
> time=0.012..0.013 rows=10 loops=2)
>  Planning time: 0.272 ms
>  Execution time: 132.623 ms
> (14 rows)
>
An observation is that the filter at the Result node can't be pushed down
to the sequential scan on t2 because the filter is on t1. So, it has to
scan the complete t2 relation and send all the tuples to the upper node,
a worst case for parallelism. This is probably the reason the optimizer
doesn't pick a parallel plan for the above case.

Just for clarification, do you see any changes in the plan after
forcing parallelism (parallel_tuple_cost, parallel_setup_cost,
min_parallel_table_scan_size=0)?


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallelize queries containing initplans

2017-08-09 Thread Amit Kapila
On Wed, Aug 9, 2017 at 10:24 AM, Haribabu Kommi
 wrote:
>
>
> On Mon, Jul 17, 2017 at 10:53 PM, Amit Kapila 
> wrote:
>>
>> On Tue, Mar 28, 2017 at 7:25 AM, Amit Kapila 
>> wrote:
>> > On Thu, Mar 16, 2017 at 2:34 AM, Kuntal Ghosh
>> >  wrote:
>> >> On Tue, Mar 14, 2017 at 3:20 PM, Amit Kapila 
>> >> wrote:
>> >>> Based on that idea, I have modified the patch such that it will
>> >>> compute the set of initplans Params that are required below gather
>> >>> node and store them as bitmap of initplan params at gather node.
>> >>> During set_plan_references, we can find the intersection of external
>> >>> parameters that are required at Gather or nodes below it with the
>> >>> initplans that are passed from same or above query level. Once the set
>> >>> of initplan params are established, we evaluate those (if they are not
>> >>> already evaluated) before execution of gather node and then pass the
>> >>> computed value to each of the workers.   To identify whether a
>> >>> particular param is parallel safe or not, we check if the paramid of
>> >>> the param exists in initplans at same or above query level.  We don't
>> >>> allow to generate gather path if there are initplans at some query
>> >>> level below the current query level as those plans could be
>> >>> parallel-unsafe or undirect correlated plans.
>> >>
>> >> I would like to mention different test scenarios with InitPlans that
>> >> we've considered while developing and testing of the patch.
>> >>
>> >
>> > Thanks a lot Kuntal for sharing different test scenarios.
>> > Unfortunately, this patch hasn't received any review till now, so
>> > there is no chance of making it into PostgreSQL 10.  I have moved
>> > this to the next CF.
>> >
>>
>> Attached is a rebased version of the patch with below changes:
>> a. SubplanState now directly stores Subplan rather than ExprState, so
>> patch needs some adjustment in that regard.
>> b. While rejecting the paths (based on whether there are initplans at a
>> level below the current query level) for parallelism, the rejected paths
>> were not marked as parallel unsafe.  Due to this in
>> force_parallel_mode=regress, we were able to add gather node above
>> parallel unsafe paths.  The modified patch ensures to mark such paths
>> as parallel unsafe.
>> c. Added regression test.
>> d. Improve comments in the code.
>>
>
> I tested the latest patch and the parallel plan is getting chosen for
> most of the init plans.
>

Thanks for looking into this patch.

> For the following query, the parallel plan is not chosen. The query contains
> an init plan that refers to the outer node.
>

We don't want to generate a parallel plan for such cases.  Whenever an
initplan refers to any outer node (i.e., a correlated plan), we won't
generate a parallel plan.  Also, for t2, it doesn't choose a parallel
plan because the one-time filter refers to the outer node (again the
correlated-plan case).  Basically, we don't yet support a parallel plan
for any case where a correlated plan is used, so it is perfectly valid
that a parallel plan isn't used here.

> postgres=# explain analyze select * from t1 where t1.i in (select t2.i from
> t2 where t1.k = (select max(k) from t3 where t3.i=t1.i));
>  QUERY PLAN
> -
>  Seq Scan on t1  (cost=0.00..22426.28 rows=448 width=12) (actual
> time=8.335..132.557 rows=2 loops=1)
>Filter: (SubPlan 2)
>Rows Removed by Filter: 894
>SubPlan 2
>  ->  Result  (cost=16.27..31.26 rows=999 width=4) (actual
> time=0.146..0.146 rows=0 loops=896)
>One-Time Filter: (t1.k = $1)
>InitPlan 1 (returns $1)
>  ->  Aggregate  (cost=16.25..16.27 rows=1 width=4) (actual
> time=0.145..0.145 rows=1 loops=896)
>->  Seq Scan on t3  (cost=0.00..16.25 rows=2 width=4)
> (actual time=0.131..0.144 rows=0 loops=896)
>  Filter: (i = t1.i)
>  Rows Removed by Filter: 900
>->  Seq Scan on t2  (cost=16.27..31.26 rows=999 width=4) (actual
> time=0.012..0.013 rows=10 loops=2)
>  Planning time: 0.272 ms
>  Execution time: 132.623 ms
> (14 rows)
>
> If I change the query a little bit, the Result node doesn't appear and the
> parallel plan
> gets chosen.
>

This is a valid case for choosing a parallel plan.

> postgres=# explain analyze select * from t1 where t1.i in (select t2.i from
> t2 where t2.k = (select max(k) from t3 where t3.i=t1.i));
>QUERY PLAN
> -
>  Seq Scan on t1  (cost=0.00..19162.88 rows=448 width=12) (actual
> time=3501.483..3501.483 rows=0 loops=1)
>

Re: [HACKERS] dubious error message from partition.c

2017-08-09 Thread Robert Haas
On Tue, Aug 8, 2017 at 11:34 PM, Tom Lane  wrote:
> A small suggestion is that it'd be better to write it like "Specified
> upper bound \"%s\" precedes lower bound \"%s\"."  I think "succeeds" has
> more alternate meanings than "precedes", so the wording you have seems
> more confusing than it needs to be.  (Of course, the situation could be
> the opposite in other languages, but translators have the ability to
> reverse the ordering if they need to.)

I think that doesn't quite work, because the failure is caused by UB
<= LB, not UB < LB.  We could fix that by writing "precedes or equals"
but that seems lame.  Maybe:

Lower bound %s does not precede upper bound %s.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b

2017-08-09 Thread Robert Haas
On Tue, Aug 8, 2017 at 7:33 PM, Dean Rasheed  wrote:
> Well perhaps verbosity-reduction isn't sufficient justification but I
> still think this is correct because logically any values in the bound
> after MINVALUE/MAXVALUE are irrelevant, so it seems overly restrictive
> to force all later values to be MINVALUE/MAXVALUE when the code will
> just ignore those values.

I just don't understand why you think there should be multiple
spellings of the same bound.  Generally, canonicalization is good.
One of my fears here is that at least some people will get confused
and think a bound from (minvalue, 0) to (maxvalue, 10) allows any
value for the first column and a 0-9 value for the second column, which
is wrong.

My other fear is that, since you've not only allowed this into the
syntax but into the catalog, this will become a source of bugs for
years to come.  Every future patch that deals with partition bounds
will now have to worry about testing
unbounded-followed-by-non-unbounded.  If we're not going to put back
those error checks completely - and I think we should - we should at
least canonicalize what actually gets stored.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Default Partition for Range

2017-08-09 Thread Rajkumar Raghuwanshi
On Wed, Aug 9, 2017 at 1:54 PM, Beena Emerson 
wrote:

> Hello Rajkumar,
>
> On Wed, Aug 9, 2017 at 12:37 PM, Rajkumar Raghuwanshi
>  wrote:
> >
> > Hi Beena,
> >
> > I have applied Jeevan's v24 patches and then your v9 patch over commit
> > 5ff3d73813ebcc3ff80be77c30b458d728951036.
> > and while testing I got a server crash.  Below is the SQL to reproduce it.
> >
> > postgres=# CREATE TABLE rp (a int, b int) PARTITION by range (a);
> > CREATE TABLE
> > postgres=# CREATE TABLE rp_p1 PARTITION OF rp DEFAULT partition by
> range(a);
> > CREATE TABLE
> > postgres=# CREATE TABLE rp_p11 PARTITION OF rp_p1 FOR VALUES FROM (1) TO
> > (15);
> > CREATE TABLE
> > postgres=# CREATE TABLE rp_p12 PARTITION OF rp_p1 DEFAULT;
> > CREATE TABLE
> > postgres=# insert into rp select i,i from generate_series(1,15) i;
> > server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
> >
>
> Thank you for testing. It seems I made a mistake in the assert
> condition. I have corrected it in this patch.
>
Thanks Beena. I have tested the new patch and the crash is fixed now. I have
two observations; please check whether these are expected.

--difference in the description of default partition in case of list vs
range

create table lp (a int) partition by list(a);
create table lp_d partition of lp DEFAULT;
postgres=# \d+ lp_d
   Table "public.lp_d"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target
| Description
+-+---+--+-+-+--+-
 a  | integer |   |  | | plain   |
|
Partition of: lp DEFAULT
Partition constraint:

create table rp (a int) partition by range(a);
create table rp_d partition of rp DEFAULT;
postgres=# \d+ rp_d
   Table "public.rp_d"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target
| Description
+-+---+--+-+-+--+-
 a  | integer |   |  | | plain   |
|
Partition of: rp DEFAULT
Partition constraint: true

--getting the warning "WARNING:  skipped scanning foreign table ... which is
a partition of default partition ..." when adding a new partition after
adding the default as a foreign partition

CREATE EXTENSION postgres_fdw;
CREATE SERVER def_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname
'postgres', port '5432', use_remote_estimate 'true');
CREATE USER MAPPING FOR PUBLIC SERVER def_server;

postgres=# CREATE TABLE frp (a int, b int) PARTITION BY RANGE(a);
CREATE TABLE
postgres=# CREATE TABLE frp_d PARTITION OF frp DEFAULT partition by
RANGE(b);
CREATE TABLE
postgres=# CREATE TABLE ffrp_d_d (like frp);
CREATE TABLE
postgres=# CREATE FOREIGN TABLE ftfrp_d_d PARTITION OF frp_d DEFAULT SERVER
def_server OPTIONS (TABLE_NAME 'ffrp_d_d');
CREATE FOREIGN TABLE
postgres=# CREATE TABLE frp_p2 PARTITION OF frp FOR VALUES FROM (10) TO
(12);
WARNING:  skipped scanning foreign table "ftfrp_d_d" which is a partition
of default partition "frp_d"
CREATE TABLE

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


Re: [HACKERS] Default Partition for Range

2017-08-09 Thread Robert Haas
On Wed, Aug 9, 2017 at 8:18 AM, Rajkumar Raghuwanshi
 wrote:
> --difference in the description of default partition in case of list vs
> range
>
> create table lp (a int) partition by list(a);
> create table lp_d partition of lp DEFAULT;
> postgres=# \d+ lp_d
>Table "public.lp_d"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats target
> | Description
> +-+---+--+-+-+--+-
>  a  | integer |   |  | | plain   |
> |
> Partition of: lp DEFAULT
> Partition constraint:
>
> create table rp (a int) partition by range(a);
> create table rp_d partition of rp DEFAULT;
> postgres=# \d+ rp_d
>Table "public.rp_d"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats target
> | Description
> +-+---+--+-+-+--+-
>  a  | integer |   |  | | plain   |
> |
> Partition of: rp DEFAULT
> Partition constraint: true

This looks like a problem with the default list partitioning patch.  I
think "true" is what we expect to see here in both cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread MauMau
Hello,

I tried to find a particular collation name in pg_collation, but I
cannot understand the naming convention after reading the following
article.  Specifically, I want to find out whether there is some
collation equivalent to Japanese_CI_AS in SQL Server, which means
Japanese, case-insensitive, and accent sensitive.  Could you tell me
how to do this?  Is there any room to improve the PG manual?


https://www.postgresql.org/docs/devel/static/collation.html
--
23.2.2.2.2. ICU collations

With ICU, it is not sensible to enumerate all possible locale names.
ICU uses a particular naming system for locales, but there are many
more ways to name
a locale than there are actually distinct locales. (In fact, any
string will be accepted as a locale name.) See
http://userguide.icu-project.org/locale
for information on ICU locale naming.
--


Regards
MauMau





Re: [HACKERS] Possible issue with expanded object infrastructure on Postgres 9.6.1

2017-08-09 Thread Justin Workman
On Tue, Aug 8, 2017 at 8:17 PM, Tom Lane  wrote:

> Peter Geoghegan  writes:
> > On Thu, Jan 19, 2017 at 5:45 PM, Peter Geoghegan  wrote:
> >> A customer is on 9.6.1, and complains of a segfault observed at least
> >> 3 times.
> > ...
> > For the sake of the archives: this now looks very much like the issue
> > that Tom just fixed with commit
> > 9bf4068cc321a4d44ac54089ab651a49d89bb567.
>
> Yeah, particularly seeing that $customer noted that some of the
> columns involved were UUIDs:
>
> https://www.postgresql.org/message-id/CAOxz3fqK9Y0GntL8MDoeZ
> jy2ot_6lx1yvhay6bd1vykup-i...@mail.gmail.com
>
> Good to have gotten to the bottom of that one.  Too bad it just
> missed the train for 9.6.4.
>
> regards, tom lane
>

$customer here. I just want to thank everyone involved for getting to the
bottom of this and for your support. Even if it missed the 9.6.4 release,
I'm very grateful for your help. We haven't had much of an issue since
disabling parallel workers, so nothing is harmed by waiting a little longer.

Thanks,
Justin


Re: [HACKERS] Cache lookup errors with functions manipulation object addresses

2017-08-09 Thread Aleksander Alekseev
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

I believe this patch is "Ready for Committer".

The new status of this patch is: Ready for Committer



Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Alvaro Herrera
Tom Lane wrote:

> I think generating whatever we can from a single authoritative file
> is indeed a good idea.

Yay.

> But I had the impression that people also wanted to enforce a rule
> about "only one use of each wait event name", which'd require a
> checker script, no?  (I'm not really convinced that we need such a
> rule, fwiw.)

I'm not convinced of that, either.  Of the possible problems in the
area, that seems the lesser one.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] why not parallel seq scan for slow functions

2017-08-09 Thread Robert Haas
On Tue, Aug 8, 2017 at 3:50 AM, Amit Kapila  wrote:
> Right.
>
> I see two ways to include the cost of the target list for parallel
> paths before rejecting them (a) Don't reject parallel paths
> (Gather/GatherMerge) during add_path.  This has the danger of path
> explosion. (b)  In the case of parallel paths, somehow try to identify
> that path has a costly target list (maybe just check if the target
> list has anything other than vars) and use it as a heuristic to decide
> that whether a parallel path can be retained.

I think the right approach to this problem is to get the cost of the
GatherPath correct when it's initially created.  The proposed patch
changes the cost after-the-fact, but that (1) doesn't prevent a
promising path from being rejected before we reach this point and (2)
is probably unsafe, because it might confuse code that reaches the
modified-in-place path through some other pointer (e.g. code which
expects the RelOptInfo's paths to still be sorted by cost).  Perhaps
the way to do that is to skip generate_gather_paths() for the toplevel
scan/join node and do something similar later, after we know what
target list we want.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] More fun with container types

2017-08-09 Thread Tom Lane
While poking at the arrays-of-domains TODO item, I found yet another area
in which the existing code fails to handle existing supported cases.
Specifically, although DefineRange will happily take domains or composites
as a range subtype, and we already noticed that domains-over-array-of-
composite are supported, find_composite_type_dependencies is innocent of
the idea that the target type might be embedded in anything but an array
or composite type.  The test cases in the attached patch show cases where
HEAD either fails to notice stored data violating a proposed new domain
constraint, or hits assertion failures or core dumps.

The patch addresses this by generalizing the existing special case for
array-over-composite so that any type that's directly dependent on the
target type (which could be its array type, or a domain or range over
it) is treated as a container type and recursively scanned for.

I believe this coding will work without any further changes for the
case of arrays over domains, but I've not fully tested that yet.

I intend to apply and back-patch this shortly, unless anyone has a
better idea about how to handle such cases.

regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1b8d4b3..2afde0a 100644
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*** ATTypedTableRecursion(List **wqueue, Rel
*** 4849,4861 
  /*
   * find_composite_type_dependencies
   *
!  * Check to see if a composite type is being used as a column in some
!  * other table (possibly nested several levels deep in composite types!).
   * Eventually, we'd like to propagate the check or rewrite operation
!  * into other such tables, but for now, just error out if we find any.
   *
!  * Caller should provide either a table name or a type name (not both) to
!  * report in the error message, if any.
   *
   * We assume that functions and views depending on the type are not reasons
   * to reject the ALTER.  (How safe is this really?)
--- 4849,4866 
  /*
   * find_composite_type_dependencies
   *
!  * Check to see if the type "typeOid" is being used as a column in some table
!  * (possibly nested several levels deep in composite types, arrays, etc!).
   * Eventually, we'd like to propagate the check or rewrite operation
!  * into such tables, but for now, just error out if we find any.
   *
!  * Caller should provide either the associated relation of a rowtype,
!  * or a type name (not both) for use in the error message, if any.
!  *
!  * Note that "typeOid" is not necessarily a composite type; it could also be
!  * another container type such as an array or range, or a domain over one of
!  * these things.  The name of this function is therefore somewhat historical,
!  * but it's not worth changing.
   *
   * We assume that functions and views depending on the type are not reasons
   * to reject the ALTER.  (How safe is this really?)
*** find_composite_type_dependencies(Oid typ
*** 4868,4878 
  	ScanKeyData key[2];
  	SysScanDesc depScan;
  	HeapTuple	depTup;
! 	Oid			arrayOid;
  
  	/*
! 	 * We scan pg_depend to find those things that depend on the rowtype. (We
! 	 * assume we can ignore refobjsubid for a rowtype.)
  	 */
  	depRel = heap_open(DependRelationId, AccessShareLock);
  
--- 4873,4885 
  	ScanKeyData key[2];
  	SysScanDesc depScan;
  	HeapTuple	depTup;
! 
! 	/* since this function recurses, it could be driven to stack overflow */
! 	check_stack_depth();
  
  	/*
! 	 * We scan pg_depend to find those things that depend on the given type.
! 	 * (We assume we can ignore refobjsubid for a type.)
  	 */
  	depRel = heap_open(DependRelationId, AccessShareLock);
  
*** find_composite_type_dependencies(Oid typ
*** 4894,4901 
  		Relation	rel;
  		Form_pg_attribute att;
  
! 		/* Ignore dependees that aren't user columns of relations */
! 		/* (we assume system columns are never of rowtypes) */
  		if (pg_depend->classid != RelationRelationId ||
  			pg_depend->objsubid <= 0)
  			continue;
--- 4901,4922 
  		Relation	rel;
  		Form_pg_attribute att;
  
! 		/* Check for directly dependent types */
! 		if (pg_depend->classid == TypeRelationId)
! 		{
! 			/*
! 			 * This must be an array, domain, or range containing the given
! 			 * type, so recursively check for uses of this type.  Note that
! 			 * any error message will mention the original type not the
! 			 * container; this is intentional.
! 			 */
! 			find_composite_type_dependencies(pg_depend->objid,
! 			 origRelation, origTypeName);
! 			continue;
! 		}
! 
! 		/* Else, ignore dependees that aren't user columns of relations */
! 		/* (we assume system columns are never of interesting types) */
  		if (pg_depend->classid != RelationRelationId ||
  			pg_depend->objsubid <= 0)
  			continue;
*** find_composite_type_dependencies(Oid typ
*** 4952,4965 
  	systable_endscan(depScan);
  

Re: [HACKERS] Re: [GSOC][weekly report 9] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-08-09 Thread Robert Haas
On Mon, Aug 7, 2017 at 1:51 PM, Alvaro Herrera  wrote:
> * the whole predicate.c stuff is written using SHM_QUEUE.  I suppose
>   SHM_QUEUE works just fine, but predicate.c was being written at about
>   the same time (or a bit earlier) than the newer ilist.c interface was
>   being created, which I think had more optimization work thrown in.
>   Maybe it would be good for predicate.c to ditch use of SHM_QUEUE and
>   use ilist.c interfaces instead?  We could even think about being less
>   strict about holding exclusive lock on SerializableFinished for the
>   duration of ClearOldPredicateLocks, i.e. use only a share lock and
>   only exchange for exclusive if a list modification is needed.

I think we should rip SHM_QUEUE out completely and get rid of it.  It
doesn't make sense to have two implementations, one of which by its
name is only for use in shared memory.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pl/perl extension fails on Windows

2017-08-09 Thread Robert Haas
On Tue, Aug 8, 2017 at 12:15 PM, Sandeep Thakkar
 wrote:
> I copied and pasted that portion of the build log into file build.log
> (attached) for Windows 32bit and Windows 64bit.

Can you also share the output of 'perl -V' on each system?

Comparing the 32-bit log and the 64-bit log, I see the following differences:

32-bit has extra options /IC:\pgBuild32\uuid\include /Oy- /analyze- /D
_USE_32BIT_TIME_T
64-bit has extra options /D WIN64 /D CONSERVATIVE /D USE_SITECUSTOMIZE
Both builds have several copies of /IC:\pgBuild32\include or
/IC:\pgBuild64\include, but the 64-bit build has an extra one

(I wrote that command on Linux; it might need different quoting to work
on Windows.)

I'm suspicious about _USE_32BIT_TIME_T.  The contents of a
PerlInterpreter are defined in Perl's intrpvar.h, and one of those
lines is:

PERLVAR(I, basetime,Time_t) /* $^T */

Now, Time_t is defined as time_t elsewhere in the Perl headers, so if
the plperl build and the Perl interpreter build had different ideas
about whether that flag was set, the interpreter sizes would be
different.  Unfortunately for this theory, if I'm interpreting the
screenshot you posted correctly, the sizes are different by exactly 16
bytes, and I can't see how a difference in the size of time_t could
account for more than 8 bytes (4 bytes of actual size change + 4 bytes
of padding).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] dubious error message from partition.c

2017-08-09 Thread Tom Lane
Alvaro Herrera  writes:
> Dean Rasheed wrote:
>> There was an earlier suggestion to use "greater than or equal to". I
>> think that would work quite well:

> Is it possible to detect the equality case specifically and use a
> different errdetail?  Something like "the lower bound %s is equal to the
> upper bound" (obviously without including both in the message.)

That seems like overkill.  I'm good with "greater than or equal to".

regards, tom lane




Re: [HACKERS] dubious error message from partition.c

2017-08-09 Thread Alvaro Herrera
Dean Rasheed wrote:
> On 9 August 2017 at 13:03, Robert Haas  wrote:
> > On Tue, Aug 8, 2017 at 11:34 PM, Tom Lane  wrote:
> >> A small suggestion is that it'd be better to write it like "Specified
> >> upper bound \"%s\" precedes lower bound \"%s\"."  I think "succeeds" has
> >> more alternate meanings than "precedes", so the wording you have seems
> >> more confusing than it needs to be.  (Of course, the situation could be
> >> the opposite in other languages, but translators have the ability to
> >> reverse the ordering if they need to.)
> >
> > I think that doesn't quite work, because the failure is caused by UB
> > <= LB, not UB < LB.  We could fix that by writing "precedes or equals"
> > but that seems lame.  Maybe:
> >
> > Lower bound %s does not precede upper bound %s.
> 
> There was an earlier suggestion to use "greater than or equal to". I
> think that would work quite well:
> 
> ERROR:  invalid range bounds for partition \"%s\"
> DETAIL:  lower bound %s is greater than or equal to upper bound %s.

Is it possible to detect the equality case specifically and use a
different errdetail?  Something like "the lower bound %s is equal to the
upper bound" (obviously without including both in the message.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread Peter Geoghegan
On Wed, Aug 9, 2017 at 5:38 AM, MauMau  wrote:
> I tried to find a particular collation name in pg_collation, but I
> cannot understand the naming convention after reading the following
> article.  Specifically, I want to find out whether there is some
> collation equivalent to Japanese_CI_AS in SQL Server, which means
> Japanese, case-insensitive, and accent sensitive.  Could you tell me
> how to do this?  Is there any room to improve the PG manual?

This is not an answer to the question you asked, but it may interest
you to know that ICU uses JIS X 4061 for Japanese, unlike Glibc. This
will give more useful results when sorting Japanese.

The best explanation of the difference that I can understand is here,
under "Why do CJK strings sort incorrectly in Unicode?":

https://dev.mysql.com/doc/refman/5.5/en/faqs-cjk.html

-- 
Peter Geoghegan




Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Alvaro Herrera
This thread is surprising.  If we generate the few problematic lines of
code, we don't need any checker script, so I don't see why we'd go
the route of the checker script instead.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2017-08-09 Thread Shubham Barai
Hi,

Please find the updated patch for predicate locking in gin index here.

There was a small issue in the previous patch. I didn't consider the case
where only the root page exists in the tree, there is a predicate lock on
it, and it gets split.

If we treat the original page as a left page and create a new root and right
page, then we just need to copy a predicate lock from the left to the right
page (this is the case in B-tree).

But if we treat the original page as a root and create a new left and right
page, then we need to copy a predicate lock on both new pages (in the case
of rum and gin).
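
The distinction above can be sketched in pseudocode (an illustration, not
the actual patch; PredicateLockPageSplit() is the existing core primitive
that copies predicate locks from an old page to a new one, and the
variable names here are invented):

```c
/* Pseudocode sketch only -- buffer/variable names are invented. */
if (is_root_split)
{
    /*
     * gin/rum case: the original page keeps its block number and becomes
     * the new root, while its contents move into two fresh children, so
     * predicate locks on the old root must be copied to BOTH new pages.
     */
    PredicateLockPageSplit(index, root_blkno, left_child_blkno);
    PredicateLockPageSplit(index, root_blkno, right_child_blkno);
}
else
{
    /*
     * Ordinary split (the b-tree case): the original page becomes the
     * left page, so only the new right page needs a copy of its locks.
     */
    PredicateLockPageSplit(index, left_blkno, right_blkno);
}
```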

link to updated code and tests:
https://github.com/shubhambaraiss/postgres/commit/6172639a104785f051cb4aa0d511c58f2bae65a6

Regards,
Shubham






On 17 July 2017 at 19:08, Shubham Barai  wrote:

> Hi,
>
> I am attaching a patch for predicate locking in gin index.
>
>
> Regards,
> Shubham
>
>
>
>
> On 11 July 2017 at 19:10, Shubham Barai  wrote:
>
>> Project: Explicitly support predicate locks in index AMs besides b-tree
>>
>>
>> I have done following tasks during this week.
>>
>> 1) worked on how to detect rw conflicts when fast update is enabled
>>
>> 2) created tests for different gin operators
>>
>> 3) went through some patches on commitfest to review
>>
>> 4) solved some issues that came up while testing
>>
>> link to the code: https://github.com/shubhambaraiss/postgres/commit/1365d75db36a4e398406dd266c3d4fe8e1ec30ff
>>
>>
>
>


Predicate-locking-in-gin-index.patch
Description: Binary data



Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Robert Haas
On Tue, Aug 1, 2017 at 4:00 PM, Ildus K  wrote:
> It's a workaround. DatumGetTSVector and
> DatumGetTSVectorCopy will upgrade tsvector on the fly if it
> has old format.

Hmm, that seems like a real fix, not just a workaround.  If you can
transparently read the old format, there's no problem.  Not sure about
performance, though.

The patch doesn't really conform to our coding standards, though, so
you need to clean it up (or, if you're not sure what you need to do,
you need to have someone who knows how PostgreSQL code needs to look
review it for you).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Tom Lane
Alvaro Herrera  writes:
> This thread is surprising.  If we generate the few problematic lines of
> code, we don't need any checker script, so I don't see why we'd go
> the route of the checker script instead.

I think generating whatever we can from a single authoritative file
is indeed a good idea.  But I had the impression that people also wanted
to enforce a rule about "only one use of each wait event name", which'd
require a checker script, no?  (I'm not really convinced that we need
such a rule, fwiw.)

regards, tom lane




Re: [HACKERS] [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities

2017-08-09 Thread Remi Colinet
2017-08-01 19:57 GMT+02:00 Robert Haas :

> On Tue, Aug 1, 2017 at 12:35 PM, Remi Colinet 
> wrote:
> > I did it in version 2 of the patch.
> > I'am skeptical about the use of JSON, XML, and others in such output.
> >
> > Does anyone use these formats (XML, JSON, YAML) for EXPLAIN output?
> > I suspect only TEXT format is being used.
>
> Do you have any reason to suspect that others aren't being used?  The
> default format for anything is likely to be the most commonly-used
> one, but I don't think that proves the others are unused.
>
> Even if it were true, it wouldn't be a good justification for
> inventing an entirely new machine-readable format, at least not IMHO.
>

In version 3, my idea was to use output similar to that of the
Ora..e database's v$session_longops dynamic table.
Maybe this is not such a good idea then, though it seems very handy at
first sight.

I can revert to TEXT, JSON, XML and YAML. I will need to modify the EXPLAIN
code in order to share some common parts for output formatting. Basically,
this would not change the code of EXPLAIN beyond moving some functions
into a pg_report.c file, with function names starting with ReportXXX instead
of ExplainXXX. Duplicating code for such output is not an option.

Rgds


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


[HACKERS] What users can do with custom ICU collations in Postgres 10

2017-08-09 Thread Peter Geoghegan
There are actually very many customizations to collations that are
possible beyond what the "stock" ICU collations provide (whatever
"stock" means). Some of these are really cool, and I can imagine use
cases where they are very compelling that have nothing to do with
internationalization (such customizations are how we should eventually
implement case-insensitive collations, once the infrastructure for
doing that without breaking hashing is in place).

I'd like to give a demo of what is already possible, but not currently
documented. I didn't see anyone else comment on this, including Peter
E (maybe I missed that?). We should improve the documentation in this
area, to get this into the hands of users.

Say we're unhappy that numbers come first, which we see here:

postgres=# select * from (select '1a' i union select '1b' union select
'1c' union select 'a1' union select 'b2' union select 'c3') j order by
i collate "en-x-icu";
 i
────
 1a
 1b
 1c
 a1
 b2
 c3
(6 rows)

We may do this to get our desired sort order:

postgres=# create collation digitlast (provider=icu,
locale='en-u-kr-latn-digit');
CREATE COLLATION
postgres=# select * from (select '1a' i union select '1b' union select
'1c' union select 'a1' union select 'b2' union select 'c3') j order by
i collate "digitlast";
 i
────
 a1
 b2
 c3
 1a
 1b
 1c
(6 rows)

Note that 'kr' is a specific BCP47 Key [1]. Many different options can
be set in this manner.

Let's say we are unhappy with the fact that capital letters sort
higher than lowercase:

postgres=# select * from (select 'B' i union select 'b' union select
'A' union select 'a') j order by i collate "en-x-icu";
 i
───
 a
 A
 b
 B
(4 rows)

ICU provides a solution here, too:

postgres=# create collation capitalfirst (provider=icu, locale='en-u-kf-upper');
CREATE COLLATION
postgres=#
select * from (select 'B' i union select 'b' union select 'A' union
select 'a') j order by i collate "capitalfirst";
 i
───
 A
 a
 B
 b
(4 rows)

And, yes -- you can even *combine* these two options by creating a
third custom collation. That can be spelled
'en-u-kf-upper-kr-latn-digit', in case you were wondering.

Users have certainly complained about not liking this or that aspect
of how glibc sorts text many times over the years, particularly around
things like how whitespace and punctuation are handled, which they can
now do something about [2]. Users can also have numbers sort like
numbers should when compared against other numbers, by using the
numericOrdering option (not shown). numericOrdering would be great for
things like alphanumeric invoice numbers, or the alphanumeric car
registration plate numbers that are used in certain countries [3],
with fixed number/letter fields. These options are very powerful.
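
As a hedged illustration (not shown in the original mail, and the exact
locale string is an assumption on my part), numericOrdering corresponds to
the BCP47 key "kn":

```sql
-- Assumed syntax for PostgreSQL 10 with ICU; "kn" requests numeric ordering.
CREATE COLLATION numeric_order (provider = icu, locale = 'en-u-kn-true');

SELECT i
FROM (VALUES ('inv-1'), ('inv-10'), ('inv-2')) v(i)
ORDER BY i COLLATE numeric_order;
-- Digit runs compare as numbers, giving inv-1, inv-2, inv-10,
-- where a plain "en-x-icu" sort would give inv-1, inv-10, inv-2.
```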

[1] http://unicode.org/reports/tr35/tr35-collation.html#Setting_Options
[2] http://unicode.org/reports/tr35/tr35-collation.html#Common_Settings
[3] 
https://en.wikipedia.org/wiki/Vehicle_registration_plates_of_the_Republic_of_Ireland
-- 
Peter Geoghegan




Re: [HACKERS] Cache lookup errors with functions manipulation object addresses

2017-08-09 Thread Michael Paquier
On Wed, Aug 9, 2017 at 2:47 PM, Aleksander Alekseev
 wrote:
> I believe this patch is "Ready for Committer".
>
> The new status of this patch is: Ready for Committer

Thanks for the lookup, but I think that this is still hasty as no
discussion has happened about a couple of APIs to get object names.
Types, operators and functions have no "cache lookup" and prefer
producing names like "???" or "-". We may want to change that. Or not.
The current patch keeps existing interfaces as much as possible but
those existing caveats remain.
-- 
Michael




Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Michael Paquier
On Wed, Aug 9, 2017 at 9:25 PM, Alvaro Herrera  wrote:
> Tom Lane wrote:
>> I think generating whatever we can from a single authoritative file
>> is indeed a good idea.
>
> Yay.

Indeed.

>> But I had the impression that people also wanted to enforce a rule
>> about "only one use of each wait event name", which'd require a
>> checker script, no?  (I'm not really convinced that we need such a
>> rule, fwiw.)
>
> I'm not convinced of that, either.  Of the possible problems in the
> area, that seems the lesser one.

With minimal maintenance effort we can be careful enough. I think
that a comment in pgstat.c about the usage uniqueness, for example,
would be an adequate answer.
-- 
Michael




Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-08-09 Thread Ashutosh Bapat
On Thu, Aug 10, 2017 at 5:09 AM, Thomas Munro
 wrote:
> On Thu, Jul 27, 2017 at 5:20 PM, Thomas Munro
>  wrote:
>> Agreed.  Here's a version that skips those useless detail messages
>> using a coding pattern I found elsewhere.
>
> Rebased after bf6b9e94.
>

Please add this to the next commitfest.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Eisentraut
> There are no case-insensitive collations in PostgreSQL (yet).

That's sad news, as I expected ICU to bring its various collation features to 
PostgreSQL.  I hope it will be easy to add them.


From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Geoghegan
> This is not an answer to the question you asked, but it may interest you
> to know that ICU uses JIS X 4061 for Japanese, unlike Glibc. This will give
> more useful results when sorting Japanese.
> 
> The best explanation of the difference that I can understand is here, under
> "Why do CJK strings sort incorrectly in Unicode?":
> 
> https://dev.mysql.com/doc/refman/5.5/en/faqs-cjk.html

Thanks a lot.  MySQL seems to have many collations, doesn't it?

Regards
Takayuki Tsunakawa




Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-08-09 Thread Thomas Munro
On Thu, Aug 10, 2017 at 1:16 PM, Ashutosh Bapat
 wrote:
> Please add this to the next commitfest.

Done:  https://commitfest.postgresql.org/14/1229/

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread Peter Geoghegan
On Wed, Aug 9, 2017 at 6:19 PM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Eisentraut
>> There are no case-insensitive collations in PostgreSQL (yet).
>
> That's sad news, as I expected ICU to bring its various collation features to 
> PostgreSQL.  I hope it will be easy to add them.

The reason it is not easy is that text equality is based on strict
binary equality. We would have to teach hash operator classes about
collations to fix this, and make text_eq() hash strxfrm() or
something. That requires special work. You can ask ICU for case
insensitivity with Postgres 10, but the strcmp() tie breaker within
varstr_cmp() will prevent it from being truly case insensitive.
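
For what it's worth, the behavior described above can be sketched like this
(a hedged illustration, not from the original mail; "ks-level2" asks ICU
for case-insensitive, secondary strength):

```sql
CREATE COLLATION ci (provider = icu, locale = 'en-u-ks-level2');

-- Sorting groups case variants together...
SELECT i FROM (VALUES ('B'), ('a'), ('b'), ('A')) v(i)
ORDER BY i COLLATE ci;

-- ...but equality is still binary, so this is false despite the
-- "case-insensitive" collation:
SELECT 'foo' = 'FOO' COLLATE ci;
```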

>> The best explanation of the difference that I can understand is here, under
>> "Why do CJK strings sort incorrectly in Unicode?":
>>
>> https://dev.mysql.com/doc/refman/5.5/en/faqs-cjk.html
>
> Thanks a lot.  MySQL seems to have many collations, doesn't it?

Well, it depends on how you define collation, which actually gets
quite complicated with ICU. You can combine certain options together
with great flexibility, it seems (see my e-mail from earlier today --
"What users can do with custom ICU collations in Postgres 10"). Let's
assume that there are 10 distinct options for each locale that each
independently affect sort order for the base collation of the locale.
I believe that there are at least 10 such options that change things,
and maybe a lot more. Theoretically, this means that there are an
absolutely enormous number of possible collations with ICU.

Now, I wouldn't *actually* say that we have many thousands of
collations with ICU, because that doesn't seem like a sensible way to
explain ICU Postgres collations. The way that we think about this from
using libc doesn't work well for ICU. Instead, I would say that we
have hundreds of locales (not collations), each of which support some
variant options (e.g., traditional Spanish sort order, alternative
Japanese sort order, pictographic emoji sorting), with some further
generic options for varying how case is handled, how numbers are
handled, and other things like that.

-- 
Peter Geoghegan




Re: [HACKERS] Log LDAP "diagnostic messages"?

2017-08-09 Thread Thomas Munro
On Thu, Jul 27, 2017 at 5:20 PM, Thomas Munro
 wrote:
> Agreed.  Here's a version that skips those useless detail messages
> using a coding pattern I found elsewhere.

Rebased after bf6b9e94.

-- 
Thomas Munro
http://www.enterprisedb.com


ldap-diagnostic-message-v3.patch
Description: Binary data



Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-09 Thread Andres Freund
On 2017-08-08 19:25:37 -0400, Peter Eisentraut wrote:
> On 8/7/17 21:06, Noah Misch wrote:
> >> That would fit.  Until v10 (commit 1e8a850), PQconnectStart() had no 
> >> in-tree
> >> callers outside of libpq itself.
> > [Action required within three days.  This is a generic notification.]
> > 
> > The above-described topic is currently a PostgreSQL 10 open item.  Peter,
> > since you committed the patch believed to have created it, you own this open
> > item.
> 
> I don't think I can usefully contribute to this.  Could someone else
> take it?

I've written up a patch at
http://archives.postgresql.org/message-id/20170806215521.d6fq4esvx7s5ejka%40alap3.anarazel.de
but I can't test this either. We really need somebody with access to
Windows to verify whether it works.  That patch needs some adjustments as
remarked by Tom, but can be tested as is.

Regards,

Andres




[HACKERS] Logical replication message type 'Y' is missing in docs

2017-08-09 Thread Masahiko Sawada
Hi,

There is a logical replication message type 'Y' for data types, but
it's not documented in section 52.9, Logical Replication Message
Formats. The attached patch fixes this. I think it can be a PG10 item.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


message_type_in_protocol_sgml.patch
Description: Binary data



Re: [HACKERS] How can I find a specific collation in pg_collation when using ICU?

2017-08-09 Thread Tom Lane
Peter Geoghegan  writes:
> On Wed, Aug 9, 2017 at 6:19 PM, Tsunakawa, Takayuki
>  wrote:
>> That's sad news, as I expected ICU to bring its various collation features 
>> to PostgreSQL.  I hope it will be easy to add them.

> The reason it is not easy is that text equality is based on strict
> binary equality.

Yeah.  You can change sort order all you want, but you can't easily
change the system's notion of equality.  But when people ask for "case
insensitive" collation, they generally want that too.

regards, tom lane




Re: [HACKERS] Causal reads take II

2017-08-09 Thread Thomas Munro
On Mon, Jul 31, 2017 at 5:49 PM, Thomas Munro
 wrote:
>  Here is a version to put it back.

Rebased after conflicting commit 030273b7.  Now using format-patch
with a commit message to keep track of review/discussion history.

-- 
Thomas Munro
http://www.enterprisedb.com


synchronous-replay-v3.patch
Description: Binary data



Re: [HACKERS] expanding inheritance in partition bound order

2017-08-09 Thread Amit Langote
On 2017/08/05 2:25, Robert Haas wrote:
> Concretely, my proposal is:
> 
> 1. Before calling RelationGetPartitionDispatchInfo, the calling code
> should use find_all_inheritors to lock all the relevant relations (or
> the planner could use find_all_inheritors to get a list of relation
> OIDs, store it in the plan in order, and then at execution time we
> visit them in that order and lock them).
> 
> 2. RelationGetPartitionDispatchInfo assumes the relations are already locked.
> 
> 3. While we're optimizing, in the first loop inside of
> RelationGetPartitionDispatchInfo, don't call heap_open().  Instead,
> use get_rel_relkind() to see whether we've got a partitioned table; if
> so, open it.  If not, there's no need.
> 
> 4. For safety, add a function bool RelationLockHeldByMe(Oid) and add
> to this loop a check if (!RelationLockHeldByMe(lfirst_oid(lc1))
> elog(ERROR, ...).  Might be interesting to stuff that check into the
> relation_open(..., NoLock) path, too.
> 
> One objection to this line of attack is that there might be a good
> case for locking only the partitioned inheritors first and then going
> back and locking the leaf nodes in a second pass, or even only when
> required for a particular row.  However, that doesn't require putting
> everything in bound order - it only requires moving the partitioned
> children to the beginning of the list.  And I think rather than having
> new logic for that, we should teach find_inheritance_children() to do
> that directly.  I have a feeling Ashutosh is going to cringe at this
> suggestion, but my idea is to do this by denormalizing: add a column
> to pg_inherits indicating whether the child is of
> RELKIND_PARTITIONED_TABLE.  Then, when find_inheritance_children scans
> pg_inherits, it can pull that flag out for free along with the
> relation OID, and qsort() first by the flag and then by the OID.  It
> can also return the number of initial elements of its return value
> which have that flag set.
> 
> Then, in find_all_inheritors, we split rels_list into
> partitioned_rels_list and other_rels_list, and process
> partitioned_rels_list in its entirety before touching other_rels_list;
> they get concatenated at the end.
> 
> Now, find_all_inheritors and find_inheritance_children can also grow a
> flag bool only_partitioned_children; if set, then we skip the
> unpartitioned children entirely.
> 
> With all that in place, you can call find_all_inheritors(blah blah,
> false) to lock the whole hierarchy, or find_all_inheritors(blah blah,
> true) to lock just the partitioned tables in the hierarchy.  You get a
> consistent lock order either way, and if you start with only the
> partitioned tables and later want the leaf partitions too, you just go
> through the partitioned children in the order they were returned and
> find_inheritance_children(blah blah, false) on each one of them and
> the lock order is exactly consistent with what you would have gotten
> if you'd done find_all_inheritors(blah blah, false) originally.

I tried implementing this in the attached set of patches.

[PATCH 2/5] Teach pg_inherits.c a bit about partitioning

Both find_inheritance_children and find_all_inheritors now list
partitioned child tables before non-partitioned ones and return
the number of partitioned tables in an optional output argument

[PATCH 3/5] Relieve RelationGetPartitionDispatchInfo() of doing locking

Anyone who wants to call RelationGetPartitionDispatchInfo() must first
acquire locks using find_all_inheritors.

TODO: Add RelationLockHeldByMe() and put if (!RelationLockHeldByMe())
elog(ERROR, ...) check in RelationGetPartitionDispatchInfo()

[PATCH 4/5] Teach expand_inherited_rtentry to use partition bound order

After locking the child tables using find_all_inheritors, we discard
the list of child table OIDs that it generates and rebuild the same
using the information returned by RelationGetPartitionDispatchInfo.

[PATCH 5/5] Store in pg_inherits if a child is a partitioned table

Catalog changes so that is_partitioned property of child tables is now
stored in pg_inherits.  This avoids consulting syscache to get that
property as is currently implemented in patch 2/5.


I haven't yet done anything about changing the timing of opening and
locking leaf partitions, because it will require some more thinking about
the required planner changes.  But the above set of patches will get us
far enough to get leaf partition sub-plans appear in the partition bound
order (same order as what partition tuple-routing uses in the executor).

With the above patches, we get the desired order of child sub-plans in
Append and ModifyTable plans for partitioned tables:

create table p (a int) partition by range (a);
create table p4 partition of p for values from (30) to (40);
create table p3 partition of p for values from (20) to (30);
create table p2 partition of p for values from (10) to (20);
create table p1 partition of p for values from (1) to (10);
create table p0 partition of p for values 

Re: [HACKERS] dubious error message from partition.c

2017-08-09 Thread Amit Langote
On 2017/08/10 5:59, Tom Lane wrote:
> Alvaro Herrera  writes:
>> Dean Rasheed wrote:
>>> There was an earlier suggestion to use "greater than or equal to". I
>>> think that would work quite well:
> 
>> Is it possible to detect the equality case specifically and use a
>> different errdetail?  Something like "the lower bound %s is equal to the
>> upper bound" (obviously without including both in the message.)
> 
> That seems like overkill.  I'm good with "greater than or equal to".

Attached updated patch has "greater than or equal to".

Thanks,
Amit
From 11d40a76dc9e56a85fe480fd5690a53406cea9c3 Mon Sep 17 00:00:00 2001
From: amit 
Date: Wed, 9 Aug 2017 14:36:57 +0900
Subject: [PATCH] Fix error message when apparently empty range bound is
 specified

---
 src/backend/catalog/partition.c| 10 +++-
 src/backend/utils/adt/ruleutils.c  | 84 +++---
 src/include/utils/ruleutils.h  |  1 +
 src/test/regress/expected/create_table.out |  6 ++-
 4 files changed, 56 insertions(+), 45 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index dcc7f8af27..65630421d3 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -722,10 +722,16 @@ check_new_partition_bound(char *relname, Relation parent,
 			 */
 			if (partition_rbound_cmp(key, lower->datums, lower->kind, true,
 									 upper) >= 0)
+			{
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
-						 errmsg("cannot create range partition with empty range"),
-						 parser_errposition(pstate, spec->location)));
+						 errmsg("invalid range bound specification for partition \"%s\"",
+								relname),
+						 errdetail("Specified lower bound %s is greater than or equal to upper bound %s.",
+								   get_range_partbound_string(spec->lowerdatums),
+								   get_range_partbound_string(spec->upperdatums)),
+						 parser_errposition(pstate, spec->location)));
+			}
 
 			if (partdesc->nparts > 0)
 			{
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index d83377d1d8..0faa0204ce 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -8722,47 +8722,9 @@ get_rule_expr(Node *node, deparse_context *context,
 										   list_length(spec->lowerdatums) ==
 										   list_length(spec->upperdatums));
 
-					appendStringInfoString(buf, "FOR VALUES FROM (");
-					sep = "";
-					foreach(cell, spec->lowerdatums)
-					{
-						PartitionRangeDatum *datum =
-							castNode(PartitionRangeDatum, lfirst(cell));
-
-						appendStringInfoString(buf, sep);
-						if (datum->kind == PARTITION_RANGE_DATUM_MINVALUE)
-							appendStringInfoString(buf, "MINVALUE");
-						else if (datum->kind == PARTITION_RANGE_DATUM_MAXVALUE)
-							appendStringInfoString(buf, "MAXVALUE");
-						else
-						{
-							Const	   *val = castNode(Const, datum->value);
-
-							get_const_expr(val, context, -1);
-						}
-						sep = ", ";
-					}
-					appendStringInfoString(buf, ") TO (");
-					sep = "";

Re: [HACKERS] parallelize queries containing initplans

2017-08-09 Thread Haribabu Kommi
On Wed, Aug 9, 2017 at 9:26 PM, Amit Kapila  wrote:

> On Wed, Aug 9, 2017 at 10:24 AM, Haribabu Kommi
>  wrote:
> >
> >
> > For the following query the parallel plan is not chosen. The query
> contains
> > an init plan that refer the outer node.
> >
>
> We don't want to generate the parallel plan for such cases.  Whenever
> initplan refers to any outer node (aka correlated plan), it won't
> generate a parallel plan.  Also, for t2, it doesn't choose a parallel
> plan because one-time filter refers to the outer node (again
> correlated plan case). Basically, till now we don't support parallel
> plan for any case where the correlated plan is used.  So, it is
> perfectly valid that it doesn't use parallel plan here.


Thanks for providing the details.


> Here if you notice the parallel node t2 refers to the initplan which
> can be parallelised after this patch.  Basically, whenever the
> initplan is attached at or above Gather node, we compute its value and
> pass down to workers.
>

Thanks for the details. I checked the code also.

By the way, I tested the patch together with my DML support for parallel
patch to check the RETURNING clause of INSERT, and all the RETURNING clause
init plans are parallel plans with this patch.


> By the way, the patch doesn't apply on HEAD, so attached rebased patch.



Thanks for the updated patch. I have some comments and I am yet to finish
the review.

+ /*
+ * We don't want to generate gather or gather merge node if there are
+ * initplans at some query level below the current query level as those
+ * plans could be parallel-unsafe or undirect correlated plans.  Ensure to
+ * mark all the partial and non-partial paths for a relation at this level
+ * to be parallel-unsafe.
+ */
+ if (is_initplan_below_current_query_level(root))
+ {
+ foreach(lc, rel->partial_pathlist)
+ {
+ Path   *subpath = (Path *) lfirst(lc);
+
+ subpath->parallel_safe = false;
+ }
+
+ foreach(lc, rel->pathlist)
+ {
+ Path   *subpath = (Path *) lfirst(lc);
+
+ subpath->parallel_safe = false;
+ }
+ return;
+ }
+

The above part of the code is almost the same in multiple places; is it
possible to factor it out into a function?
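
Something like the following, perhaps (a hedged sketch against planner
internals, not runnable on its own; the helper name is invented):

```c
/* Invented helper name: mark every partial and non-partial path of a
 * relation as parallel-unsafe. */
static void
mark_rel_paths_parallel_unsafe(RelOptInfo *rel)
{
	ListCell   *lc;

	foreach(lc, rel->partial_pathlist)
		((Path *) lfirst(lc))->parallel_safe = false;

	foreach(lc, rel->pathlist)
		((Path *) lfirst(lc))->parallel_safe = false;
}
```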


+ node->initParam = NULL;

This new parameter is set to NULL in the make_gather function. The same
parameter is added to the GatherMerge structure as well, but there it is
left to the makeNode macro to zero it out; why set it explicitly in one
place but not the other?

Do we need to set a field to its default value, such as NULL or false, if
it already has that value? This question is not specific to this
parameter; it applies to the existing parameters as well.


+ if (IsA(plan, Gather))
+ ((Gather *) plan)->initParam = bms_intersect(plan->lefttree->extParam,
initSetParam);
+ else if (IsA(plan, GatherMerge))
+ ((GatherMerge *) plan)->initParam =
bms_intersect(plan->lefttree->extParam, initSetParam);
+ else
+ elog(ERROR, "unrecognized node type: %d", nodeTag(plan));

The else case is not possible, because the node has already been validated
to be a Gather or GatherMerge. Can we change it to a simple if and else?

Regards,
Hari Babu
Fujitsu Australia


[HACKERS] Review of DDL replication solution

2017-08-09 Thread Jeremy Finzel
Hello!

I am new to Postgres backend development, and have recently created a
project with minimal C code.  It is a transparent DDL replication solution
built on top of pglogical, but with a concept that could also be used with
other logical replication solutions if they support some mechanism to
propagate SQL to subscribers.  I describe some of its limitations in the
docs:
https://github.com/enova/pgl_ddl_deploy

I would greatly appreciate some critique/review of the C code in
particular:
https://github.com/enova/pgl_ddl_deploy/blob/master/pgl_ddl_deploy.c

It is extremely simple and most of the code is shared from other existing
Postgres code.  But if we put this into production, I'd love for some more
experienced devs to at least take a look at it.

Likewise, I would love if in general anyone would try to see if using this
is feasible in their environment.

I have posted already on the bdr/pglogical list, but would also like to
share this with the broader community.

I appreciate all positive/negative feedback.  Cheers!

Jeremy Finzel
(DBA at Enova International)


Re: [HACKERS] Parallel Append implementation

2017-08-09 Thread Robert Haas
On Wed, Jul 5, 2017 at 7:53 AM, Amit Khandekar  wrote:
>> This is not applicable on the latest head i.e. commit --
>> 08aed6604de2e6a9f4d499818d7c641cbf5eb9f7, looks like need a rebasing.
>
> Thanks for notifying. Attached is the rebased version of the patch.

This again needs a rebase.

(And, hey everybody, it also needs some review!)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-08-09 Thread Thomas Munro
On Tue, Aug 8, 2017 at 8:51 PM, Ashutosh Bapat
 wrote:
> Updated patches attached.

Hi,

I started reviewing this.  It's nicely commented, but it's also very
complicated, and it's going to take me several rounds to understand
what all the parts do.  Here's some assorted feedback after reading
some parts of the patches, some tinkering, and quite a bit of time
spent trying to break it (so far unsuccessfully).

On my computer it took ~1.5 seconds to plan a 1000 partition join,
~7.1 seconds to plan a 2000 partition join, and ~50 seconds to plan a
4000 partition join.  I poked around in a profiler a bit and saw that
for the 2000 partition case I spent almost half the time in
create_plan->...->prepare_sort_from_pathkeys->find_ec_member_for_tle,
and about half of that was in bms_is_subset.  The other half of the
time was in query_planner->make_one_rel, which spent 2/3 of its time in
set_rel_size->add_child_rel_equivalences->bms_overlap and the other
1/3 in standard_join_search.

One micro-optimisation opportunity I noticed was in those
bms_is_subset and bms_overlap calls.  The Bitmapsets don't tend to
have trailing words but often have hundreds of empty leading words.
If I hack bitmapset.{c,h} so that it tracks first_non_empty_wordnum
and then adjust bms_is_subset and bms_overlap so that they start their
searches at Min(a->first_non_empty_wordnum,
b->first_non_empty_wordnum) then the planning time improves
measurably:

1000 partitions: ~1.5s -> 1.3s
2000 partitions: ~7.1s -> 5.8s
4000 partitions: ~50s -> ~44s
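A self-contained sketch of that idea, using an invented fixed-size Bitmapset
(the real one is variable-length, so the bookkeeping would differ); the
`first_non_empty_wordnum` field and `NWORDS` are assumptions for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NWORDS 128              /* fixed size, for the sketch only */

typedef struct Bitmapset
{
    int      first_non_empty_wordnum;   /* hypothetical tracking field */
    uint64_t words[NWORDS];
} Bitmapset;

static void
bms_init(Bitmapset *s)
{
    memset(s->words, 0, sizeof(s->words));
    s->first_non_empty_wordnum = NWORDS;    /* empty set */
}

static void
bms_add(Bitmapset *s, int bit)
{
    int wordnum = bit / 64;

    s->words[wordnum] |= UINT64_C(1) << (bit % 64);
    if (wordnum < s->first_non_empty_wordnum)
        s->first_non_empty_wordnum = wordnum;
}

/* bms_overlap, starting the scan at the Min of the two tracked positions
 * so the hundreds of empty leading words are skipped */
static bool
bms_overlap(const Bitmapset *a, const Bitmapset *b)
{
    int start = a->first_non_empty_wordnum < b->first_non_empty_wordnum
              ? a->first_non_empty_wordnum
              : b->first_non_empty_wordnum;

    for (int w = start; w < NWORDS; w++)
        if (a->words[w] & b->words[w])
            return true;
    return false;
}
```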

When using list-based partitions, it must be possible to omit the part
of a join key that is implied by the partition because the partition
has only one list value.  For example, if I create a two level
hierarchy with one partition per US state and then time-based range
partitions under that, the state part of this merge condition is
redundant:

 Merge Cond: ((sales_wy_2017_10.state =
purchases_wy_2017_10.state) AND (sales_wy_2017_10.created =
purchases_wy_2017_10.created))

0003-Refactor-partition_bounds_equal-to-be-used-without-P.patch

-partition_bounds_equal(PartitionKey key,
+partition_bounds_equal(int partnatts, int16 *parttyplen, bool *parttypbyval,
   PartitionBoundInfo b1,
PartitionBoundInfo b2)

I wonder whether there is any value in creating a struct to represent
the common part of PartitionKey and PartitionScheme that functions like
this and others need?  Maybe not.  Perhaps you didn't want to make
PartitionKey contain a PartitionScheme because then you'd lose the
property that every pointer to PartitionScheme in the system must be a
pointer to an interned (canonical) PartitionScheme, so it's always
safe to compare pointers to test for equality?

0005-Canonical-partition-scheme.patch:

+/*
+ * get_relation_partition_info
+ *
+ * Retrieves partitioning information for a given relation.
+ *
+ * For a partitioned table it sets partitioning scheme, partition key
+ * expressions, number of partitions and OIDs of partitions in the given
+ * RelOptInfo.
+ */
+static void
+get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+   Relation relation)

Would this be better called "set_relation_partition_info"?  It doesn't
really "retrieve" the information, it "installs" it.

+{
+   PartitionDesc part_desc;
+
+   /* No partitioning information for an unpartitioned relation. */
+   if (relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE ||
+   !(rel->part_scheme = find_partition_scheme(root, relation)))
+   return;

Here and elsewhere you use the idiom !(foo = bar), which is perfectly
good C in my book but I understand the project convention is to avoid
implicit pointer->boolean conversion and to prefer expressions like
(foo = bar) != NULL and there is certainly a lot more code like that.
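For illustration, the two spellings side by side in a trivial standalone
example (not taken from the patch):

```c
#include <assert.h>
#include <stdlib.h>

/* Implicit pointer-to-boolean conversion: valid C, but against the
 * prevailing project convention. */
static int
alloc_implicit(char **out)
{
    if (!(*out = malloc(16)))
        return 0;
    return 1;
}

/* Explicit NULL comparison: the spelling most of the tree uses. */
static int
alloc_explicit(char **out)
{
    if ((*out = malloc(16)) == NULL)
        return 0;
    return 1;
}
```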

0007-Partition-wise-join-implementation.patch

+   {"enable_partition_wise_join", PGC_USERSET, QUERY_TUNING_METHOD,

This GUC should appear in postgresql.conf.sample.
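For reference, the other enable_* planner settings appear in that file as
commented-out defaults, so presumably something along these lines (the
default value shown is an assumption):

```
#enable_partition_wise_join = off
```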

I'm chewing on 0007.  More soon.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Tom Lane
Thomas Munro  writes:
> As for whether hypothetical check scripts would ever be run, I was
> thinking we should stick them under some make target that developers
> run all the time anyway -- perhaps "check".  Shouldn't we catch simple
> mechanically detectable problems as early in the pipeline as possible?

Adding overhead to every developer's every test cycle doesn't sound
like a win.  Possibly a reasonable compromise would be to have some
buildfarm members running this check.

regards, tom lane




Re: [HACKERS] Walsender timeouts and large transactions

2017-08-09 Thread Petr Jelinek
On 02/08/17 19:35, Yura Sokolov wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
> 
> There is no check for (last_reply_timestamp <= 0 || wal_sender_timeout <= 0) 
> as in other places
> (in WalSndKeepaliveIfNecessary for example).
> 
> I don't think moving the update of 'now' down to the end of the loop body
> is correct: there are calls to ProcessConfigFile with SyncRepInitConfig and
> ProcessRepliesIfAny that can last a non-negligible time. It could lead to
> oversleeping due to a larger computed sleeptime.
> Though I could be mistaken.
> 
> I'm not sure about moving `if (!pq_is_send_pending())` in the loop body
> after WalSndKeepaliveIfNecessary.
> Is it necessary? But it looks harmless at least.
> 

We also need to do actual timeout handling so that the timeout is not
deferred to the end of the transaction (which is why I moved `if
(!pq_is_send_pending())` under the WalSndCheckTimeOut() and
WalSndKeepaliveIfNecessary() calls).

> Could the patch be reduced to a check after the first `if
> (!pq_is_send_pending())`? Like:
> 
>   if (!pq_is_send_pending())
> - return;
> + {
> + if (last_reply_timestamp <= 0 || wal_sender_timeout <= 0)
> + {
> + CHECK_FOR_INTERRUPTS();
> + return;
> + }
> + if (now <= TimestampTzPlusMilliseconds(last_reply_timestamp, 
> wal_sender_timeout / 2))
> + return;
> + }
> 
> If not, what problem prevents that?

We should do CHECK_FOR_INTERRUPTS() independently of pq_is_send_pending
so that it's possible to stop a walsender while it's processing a large
transaction.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Timing-sensitive case in src/test/recovery TAP tests

2017-08-09 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Aug 8, 2017 at 11:33 PM, Tom Lane  wrote:
>> Michael Paquier  writes:
>>> I got the same thought, wondering as well if get_slot_xmins should be
>>> renamed check_slot_xmins with the is() tests moved inside it as well.
>>> Not sure if that's worth the API ugliness though.

>> Mmm, doesn't seem like that's worth doing, but I'm half tempted to merge
>> wait_slot_xmins into get_slot_xmins so you can't skip it ...

> Let's do that please. Merging both was my first feeling when
> refactoring this test upthread. Should I send a patch?

Sure, have at it.

regards, tom lane




Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Torsten Zuehlsdorff



On 01.08.2017 22:00, Ildus K wrote:
> On Tue, 1 Aug 2017 15:33:08 -0400
> Robert Haas  wrote:
>> On Tue, Aug 1, 2017 at 3:10 PM, Ildus K
>>  wrote:
>>>> So this would break pg_upgrade for tsvector columns?
>>>
>>> I added a function that will convert old tsvectors on the fly. It's
>>> the approach used in hstore before.
>>
>> Does that mean the answer to the question that I asked is "yes, but I
>> have a workaround" or does it mean that the answer is "no"?
>
> It's a workaround. DatumGetTSVector and DatumGetTSVectorCopy will
> upgrade tsvector on the fly if it has old format.


I'm not familiar with pg_upgrade, but want to ask: should this 
workaround be part of pg_upgrade?


Greetings,
Torsten




Re: [HACKERS] "make check" with non-GNU make

2017-08-09 Thread Tom Lane
Thomas Munro  writes:
> On Wed, Aug 9, 2017 at 3:44 PM, Tom Lane  wrote:
>> Hmm, looking into Makefile.global.in, that step seems to be conditional on
>> MAKELEVEL:

> Ah, right.  That coding is recommended in the GNU make manual to
> distinguish from explicit invocation and recursive invocation.
> FreeBSD make also seems to set MAKELEVEL.

Oh, that would do it.

> Doing this fixes the
> problem for me, though it feels a bit sneaky:

> -  $${GMAKE} $@ ; \
> +  MAKELEVEL= $${GMAKE} $@ ; \

Seems like a reasonable fix to me.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 9, 2017 at 10:14 AM, Tom Lane  wrote:
>> Thomas Munro  writes:
>>> As for whether hypothetical check scripts would ever be run, I was
>>> thinking we should stick them under some make target that developers
>>> run all the time anyway -- perhaps "check".  Shouldn't we catch simple
>>> mechanically detectable problems as early in the pipeline as possible?

>> Adding overhead to every developer's every test cycle doesn't sound
>> like a win.

> If it takes 100ms, nobody's gonna notice.

I doubt running a perl script that analyzes the entire backend source
code is gonna take 100ms.

regards, tom lane




Re: [HACKERS] Remove 1MB size limit in tsvector

2017-08-09 Thread Ildus Kurbangaliev
On Wed, 9 Aug 2017 09:01:44 +0200
Torsten Zuehlsdorff  wrote:

> On 01.08.2017 22:00, Ildus K wrote:
> > On Tue, 1 Aug 2017 15:33:08 -0400
> > Robert Haas  wrote:
> >   
> >> On Tue, Aug 1, 2017 at 3:10 PM, Ildus K
> >>  wrote:  
>  So this would break pg_upgrade for tsvector columns?  
> >>>
> >>> I added a function that will convert old tsvectors on the fly.
> >>> It's the approach used in hstore before.  
> >>
> >> Does that mean the answer to the question that I asked is "yes,
> >> but I have a workaround" or does it mean that the answer is "no"?
> >>  
> > 
> > It's a workaround. DatumGetTSVector and
> > DatumGetTSVectorCopy will upgrade tsvector on the fly if it
> > has old format.  
> 
> I'm not familiar with pg_upgrade, but want to ask: should this 
> workaround be part of pg_upgrade?
> 
> Greetings,
> Torsten

I chose the approach where the data remains the same until the user
decides to update it. I'm not so familiar with pg_upgrade myself and
don't currently see how the data would be converted with it, but doing
it there would in any case increase downtime, and the shorter that is,
the better.
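As a toy sketch of the upgrade-on-read idea: every structure and name below
is invented for illustration, and the real conversion inside DatumGetTSVector
rewrites the stored data rather than just stamping a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format marker: set in the header word of new-format values */
#define NEW_FORMAT_FLAG 0x80000000u

typedef struct Vec
{
    uint32_t header;    /* element count; NEW_FORMAT_FLAG set when new */
    uint32_t items[8];
} Vec;

/* Every read site goes through this accessor.  An old-format value is
 * rewritten into the new layout the first time it is touched, so the
 * stored data can stay as-is until the user updates it. */
static Vec *
vec_get(Vec *v)
{
    if ((v->header & NEW_FORMAT_FLAG) == 0)
    {
        /* Old format detected.  A real conversion would re-encode the
         * items here; the sketch only stamps the marker. */
        v->header |= NEW_FORMAT_FLAG;
    }
    return v;
}

static uint32_t
vec_count(Vec *v)
{
    return vec_get(v)->header & ~NEW_FORMAT_FLAG;
}
```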

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] parallelize queries containing initplans

2017-08-09 Thread Haribabu Kommi
On Wed, Aug 9, 2017 at 8:54 PM, Kuntal Ghosh 
wrote:

> On Wed, Aug 9, 2017 at 10:24 AM, Haribabu Kommi
>  wrote:
> >
> > I tested the latest patch and the parallel plan is getting chosen for
> > most of the init plans.
> >
> Thanks for testing.
>
> > For the following query the parallel plan is not chosen. The query
> > contains an init plan that refers to the outer node.
> >
> > postgres=# explain analyze select * from t1 where t1.i in (select t2.i
> from
> > t2 where t1.k = (select max(k) from t3 where t3.i=t1.i));
> >  QUERY PLAN
> > 
> -
> >  Seq Scan on t1  (cost=0.00..22426.28 rows=448 width=12) (actual
> > time=8.335..132.557 rows=2 loops=1)
> >Filter: (SubPlan 2)
> >Rows Removed by Filter: 894
> >SubPlan 2
> >  ->  Result  (cost=16.27..31.26 rows=999 width=4) (actual
> > time=0.146..0.146 rows=0 loops=896)
> >One-Time Filter: (t1.k = $1)
> >InitPlan 1 (returns $1)
> >  ->  Aggregate  (cost=16.25..16.27 rows=1 width=4) (actual
> > time=0.145..0.145 rows=1 loops=896)
> >->  Seq Scan on t3  (cost=0.00..16.25 rows=2 width=4)
> > (actual time=0.131..0.144 rows=0 loops=896)
> >  Filter: (i = t1.i)
> >  Rows Removed by Filter: 900
> >->  Seq Scan on t2  (cost=16.27..31.26 rows=999 width=4)
> (actual
> > time=0.012..0.013 rows=10 loops=2)
> >  Planning time: 0.272 ms
> >  Execution time: 132.623 ms
> > (14 rows)
> >
> An observation is that the filter at the Result node can't be pushed down
> to the sequential scan on t2 because the filter is on t1. So, it has
> to scan the complete t2 relation and send all the tuples to the upper
> node, a worst case for parallelism. Probably, this is the reason the
> optimizer doesn't pick a parallel plan for the above case.
>
> Just for clarification, do you see any changes in the plan after
> forcing parallelism (parallel_tuple_cost, parallel_setup_cost,
> min_parallel_table_scan_size = 0)?


There is no plan change with parallel* GUC changes.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread Robert Haas
On Wed, Aug 9, 2017 at 10:14 AM, Tom Lane  wrote:
> Thomas Munro  writes:
>> As for whether hypothetical check scripts would ever be run, I was
>> thinking we should stick them under some make target that developers
>> run all the time anyway -- perhaps "check".  Shouldn't we catch simple
>> mechanically detectable problems as early in the pipeline as possible?
>
> Adding overhead to every developer's every test cycle doesn't sound
> like a win.

If it takes 100ms, nobody's gonna notice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix inadequacies in recently added wait events

2017-08-09 Thread David Fetter
On Wed, Aug 09, 2017 at 10:56:50AM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > On Wed, Aug 9, 2017 at 10:14 AM, Tom Lane  wrote:
> >> Thomas Munro  writes:
> >>> As for whether hypothetical check scripts would ever be run, I
> >>> was thinking we should stick them under some make target that
> >>> developers run all the time anyway -- perhaps "check".
> >>> Shouldn't we catch simple mechanically detectable problems as
> >>> early in the pipeline as possible?
> 
> >> Adding overhead to every developer's every test cycle doesn't
> >> sound like a win.
> 
> > If it takes 100ms, nobody's gonna notice.
> 
> I doubt running a perl script that analyzes the entire backend
> source code is gonna take 100ms.

What would be a reasonable maximum amount of time for such a check to take?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

