Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Tsunakawa, Takayuki
Hello Robert, Tom,

Thank you for being kind enough to explain.  I think I now understand your
concern.

From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> Who is right is a judgement call, but I don't think it's self-evident that
> users want to ignore anything and everything that might have gone wrong
> with the connection to the first server, rather than only those things which
> resemble a down server.  It seems quite possible to me that if we had defined
> it as you are proposing, somebody would now be arguing for a behavior change
> in the other direction.

Judgment call... so I understand that it's a matter of choosing between
helping to detect configuration errors early and preserving service
continuity.  Hmm, I'd like to know how other databases treat this, but I
couldn't find useful information after some Googling.  I wonder whether I
should ask the PgJDBC people if they know something, because they chose
service continuity.


From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> The bigger picture here is that we only want to fail past transient errors,
> not configuration errors.  I'm willing to err in favor of regarding doubtful
> cases as transient, but most server login rejections aren't for transient
> causes.

I got "doubtful cases" as ones such as specifying non-existent host or an 
unused port number.  In that case, the configuration error can't be 
distinguished from the server failure.

What do you think of the following cases?  Don't you want to connect to other 
servers?

* The DBA shuts down the database.  The server takes a long time to do 
checkpointing.  During the shutdown checkpoint, libpq tries to connect to the 
server and receives the error "the database system is shutting down."

* The former primary failed and is now trying to start as a standby, catching 
up by applying WAL.  During the recovery, libpq tries to connect to the server 
and receives the error "the database system is performing recovery."

* The database server crashed due to a bug.  Unfortunately, the server takes 
an unexpectedly long time to shut down because it takes many seconds to write 
the stats file (as you remember, Tom-san experienced a 57-second stats file 
write during regression tests.)  During the stats file write, libpq tries to 
connect to the server and receives the error "the database system is shutting 
down."

These cases are equivalent to server failure.  I believe we should prioritize 
surviving failures during operation over detecting configuration errors early.
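(For reference, the feature under discussion is PG10's multi-host connection
string; the host names here are just placeholders:)

    /* libpq tries "foo" first and, if that attempt fails, falls back to
     * "bar"; the question is which classes of failure should trigger the
     * fallback */
    PGconn *conn = PQconnectdb("host=foo,bar dbname=app");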


> Of course, the user would have to try connections to both foo and bar to
> be sure that they're both configured correctly.  But he might try
> "host=foo,bar" and "host=bar,foo" and figure he was OK, not noticing that
> both connections had silently been made to bar.

In that case, I think he would specify "host=foo" and "host=bar" in turn, 
because if he specified multiple hosts he would be worried about which server 
he's actually connected to.

Regards
Takayuki Tsunakawa





Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-05-17 Thread Michael Paquier
On Fri, Mar 31, 2017 at 9:58 PM, Ashutosh Bapat
 wrote:
> Then the question is why not allow savepoints as well? For that we
> have to fix the transaction block state machine.

I agree with this argument. I have been looking at the patch, and what
it does is definitely incorrect. Any query string including multiple
queries sent to the server is executed as a single transaction. So,
while the current behavior of the server is definitely incorrect for
savepoints in this case, the proposed patch does not fix anything but
actually makes things worse. I think that instead of failing,
savepoints should be able to work properly. As you say, cursors are
handled correctly; savepoints should fall under the same rules.
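
For illustration, a minimal libpq sketch of the problem case (table name
hypothetical); the whole string is sent as one simple-query message and
therefore executes as a single implicit transaction:

    /* the SAVEPOINT inside a multi-statement string is what currently
     * terminates the connection, instead of being handled properly */
    PGresult *res = PQexec(conn,
        "INSERT INTO t VALUES (1); SAVEPOINT s1; INSERT INTO t VALUES (2);");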
-- 
Michael




Re: [HACKERS] [bug fix] Savepoint-related statements terminates connection

2017-05-17 Thread Tsunakawa, Takayuki
From: Michael Paquier [mailto:michael.paqu...@gmail.com]
> On Fri, Mar 31, 2017 at 9:58 PM, Ashutosh Bapat
>  wrote:
> > Then the question is why not allow savepoints as well? For that we
> > have to fix the transaction block state machine.
> 
> I agree with this argument. I have been looking at the patch, and what it
> does is definitely incorrect. Any query string including multiple queries
> sent to the server is executed as a single transaction. So, while the current
> behavior of the server is definitely incorrect for savepoints in this case,
> the proposed patch does not fix anything but actually makes things worse.
> I think that instead of failing, savepoints should be able to work properly.
> As you say, cursors are handled correctly; savepoints should fall under
> the same rules.

Yes, I agree with your opinion.  I'll put more thought into whether it's 
feasible, and how invasive the code changes would need to be.

Regards
Takayuki Tsunakawa




Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)

2017-05-17 Thread Thomas Munro
On Wed, May 17, 2017 at 6:04 PM, Amit Langote
 wrote:
> On 2017/05/17 11:22, Thomas Munro wrote:
>> Here is that patch.  Thoughts?
>
> I looked at the patch and noticed that there might be some confusion about
> what's in the EState.es_root_result_relations array.

Thanks for looking at this!

> ...
>
> targetRelInfo should instead be set to mtstate->rootResultRelInfo that was
> set in ExecInitModifyTable() as described above, IOW, as follows:
>
> /* Partitioned table. */
> if (mtstate->rootResultRelInfo != NULL)
>     targetRelInfo = mtstate->rootResultRelInfo;

Ah, I see.  Thank you.  Fixed in the attached.

-- 
Thomas Munro
http://www.enterprisedb.com


transition-tuples-from-child-tables-v5.patch
Description: Binary data



Re: [HACKERS] Bug in ExecModifyTable function and trigger issues for foreign tables

2017-05-17 Thread Ildus Kurbangaliev
On Wed, 17 May 2017 15:28:24 +0900
Michael Paquier  wrote:

> On Tue, May 16, 2017 at 11:26 PM, Ildus Kurbangaliev
>  wrote:
> > On Tue, 16 May 2017 21:36:11 +0900
> > Etsuro Fujita  wrote:  
> >> On 2017/05/16 21:11, Ashutosh Bapat wrote:  
> >> > On Tue, May 16, 2017 at 5:35 PM, Ildus Kurbangaliev
> >> >  wrote:  
> >>  
> >> >> I agree. Maybe this issue should be added to PostgreSQL Open
> >> >> Items? I think there should be some comprehensive solution that
> >> >> fixes not only triggers for foreign tables under table
> >> >> partitioning, but also covers other possibly broken cases.
> >>  
> >> > I doubt if this is an open item, since DMLs on foreign tables are
> >> > supported since 9.3 and support to add foreign tables to
> >> > inheritance was added back in 9.5.  
> >>
> >> I think this issue was introduced by the latter, so that was my
> >> fault.
> >>
> >> One approach I came up with to fix this issue is to rewrite the
> >> targetList entries of an inherited UPDATE/DELETE properly using
> >> rewriteTargetListUD, when generating a plan for each child table in
> >> inheritance_planner.  Attached is a WIP patch for that.  Maybe I am
> >> missing something, though.  
> 
> Could this patch include some regression tests to see to what extent
> it has been tested? We surely don't want to see this broken again in
> the future. (Nit: I did not look at the patch in detail yet.)
> 
> > I tested the patch, looks good.  
> 
> What kind of tests did you do?

I tested update triggers for a foreign table when a regular table is the
parent and a foreign table is a child. A case like this:

explain verbose update parent set val = random();
  QUERY PLAN
---
 Update on public.parent  (cost=0.00..258.63 rows=5535 width=10)
   Update on public.parent
   Update on public.child1
   Foreign Update on public.ftable // we have triggers on ftable here

> 
>   junkfilter = resultRelInfo->ri_junkFilter;
> + tupleid = NULL;
>   estate->es_result_relation_info = resultRelInfo;
> Er, what is that?

That fixes the bug where the tupleid from a regular tuple is used to get
the oldtuple for triggers of a foreign table.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-17 Thread Ashutosh Bapat
On Tue, May 16, 2017 at 9:01 PM, Robert Haas  wrote:
> On Tue, May 16, 2017 at 8:57 AM, Jeevan Ladhe
>  wrote:
>> I have fixed the crash in attached patch.
>> Also the patch needed bit of adjustments due to recent commit.
>> I have re-based the patch on latest commit.
>
> +    bool        has_default;    /* Is there a default partition?
> +                                 * Currently false for a range
> +                                 * partitioned table */
> +    int         default_index;  /* Index of the default list
> +                                 * partition. -1 for range
> +                                 * partitioned tables */
>

We have has_null and null_index for list partitioning, where null_index ==
-1 exactly when has_null is false. Maybe Rahila and/or Jeevan just copied
that style. Probably we should change that as well?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Bug in ExecModifyTable function and trigger issues for foreign tables

2017-05-17 Thread Etsuro Fujita

On 2017/05/17 17:54, Ildus Kurbangaliev wrote:
> On Wed, 17 May 2017 15:28:24 +0900
> Michael Paquier  wrote:
>
>> On Tue, May 16, 2017 at 11:26 PM, Ildus Kurbangaliev
>>  wrote:
>>> On Tue, 16 May 2017 21:36:11 +0900
>>> Etsuro Fujita  wrote:
>>>> On 2017/05/16 21:11, Ashutosh Bapat wrote:
>>>>> On Tue, May 16, 2017 at 5:35 PM, Ildus Kurbangaliev
>>>>>  wrote:
>>>>>> I agree. Maybe this issue should be added to PostgreSQL Open
>>>>>> Items? I think there should be some comprehensive solution that
>>>>>> fixes not only triggers for foreign tables under table
>>>>>> partitioning, but also covers other possibly broken cases.
>>>>>
>>>>> I doubt if this is an open item, since DMLs on foreign tables are
>>>>> supported since 9.3 and support to add foreign tables to
>>>>> inheritance was added back in 9.5.
>>>>
>>>> I think this issue was introduced by the latter, so that was my
>>>> fault.
>>>>
>>>> One approach I came up with to fix this issue is to rewrite the
>>>> targetList entries of an inherited UPDATE/DELETE properly using
>>>> rewriteTargetListUD, when generating a plan for each child table in
>>>> inheritance_planner.  Attached is a WIP patch for that.  Maybe I am
>>>> missing something, though.
>>
>> Could this patch include some regression tests to see to what extent
>> it has been tested? We surely don't want to see this broken again in
>> the future. (Nit: I did not look at the patch in detail yet.)

OK, I'll include regression tests in the next version of the patch.

>>> I tested the patch, looks good.
>>
>> What kind of tests did you do?
>
> I tested update triggers for a foreign table when a regular table is the
> parent and a foreign table is a child. A case like this:
>
> explain verbose update parent set val = random();
>   QUERY PLAN
> ---
>  Update on public.parent  (cost=0.00..258.63 rows=5535 width=10)
>    Update on public.parent
>    Update on public.child1
>    Foreign Update on public.ftable // we have triggers on ftable here
>
>>   junkfilter = resultRelInfo->ri_junkFilter;
>> + tupleid = NULL;
>>   estate->es_result_relation_info = resultRelInfo;
>> Er, what is that?
>
> That fixes the bug where the tupleid from a regular tuple is used to get
> the oldtuple for triggers of a foreign table.


That's right.  Let me explain in more detail.  Currently, tupleid is 
only initialized at the top of ExecModifyTable, so if we just loop 
within the for(;;) code in that function (without returning a RETURNING 
tuple to the caller), tupleid won't be re-initialized when advancing to 
the next subplan in the case of an inherited UPDATE/DELETE.  This would 
cause a problem.  Assume that the current subplan is for the parent, 
which is a plain table, that the next subplan is for the child, which 
is a foreign table, and that the foreign table has a BEFORE trigger, as 
tested by Ildus.  In that case the current subplan would set tupleid to 
the ctid for each row from the plain table, and after advancing to the 
next subplan, that subplan would set oldtuple to the wholerow value for 
the first row from the foreign table, *without initializing tupleid to 
NULL*, and then call ExecBRUpdateTriggers/ExecBRDeleteTriggers during 
ExecUpdate/ExecDelete, which would cause an assertion failure for 
Assert(HeapTupleIsValid(fdw_trigtuple) ^ ItemPointerIsValid(tupleid)) 
in those trigger functions, because oldtuple and tupleid are both 
valid.  So, tupleid should be initialized at least when advancing to 
the next subplan.  It might be better to initialize it at each 
iteration of the for(;;) code, like oldtuple, though.
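
To illustrate the suggestion, here is a minimal sketch against the for(;;)
loop in ExecModifyTable (an illustration only, not the exact patch):

    for (;;)
    {
        /* ... fetch the next row from the current subplan ... */

        /*
         * Reset per-row state here, so that a tupleid left over from the
         * previous subplan (a plain table) cannot be seen while processing
         * rows from the next subplan (a foreign table).
         */
        tupleid = NULL;
        oldtuple = NULL;

        /* ... dispatch to ExecUpdate/ExecDelete as before ... */
    }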


Best regards,
Etsuro Fujita





Re: [HACKERS] Improvement in log message of logical replication worker

2017-05-17 Thread Masahiko Sawada
On Wed, May 17, 2017 at 11:33 AM, Masahiko Sawada  wrote:
> On Wed, May 17, 2017 at 4:08 AM, Robert Haas  wrote:
>> On Fri, May 12, 2017 at 12:30 AM, Masahiko Sawada  
>> wrote:
>>> Attached small patch adds relid to these log messages if it's valid.
>>> I'd like to propose it for PG10 if possible, but since it's not a bug
>>> and not an open item, we can add it to the next CF.
>>
>> To me, it seems completely reasonable to add this as an open item.
>>
>
> Okay, added this to the open items.
>

Attached is a minor updated patch.

BTW, why does max_replication_slots need to be set to more than 0 even on
the subscriber side? It's documented, but I could not understand the reason.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


log_message_improvement_for_logical_repl_v2.patch
Description: Binary data



Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-17 Thread Amit Langote
On 2017/05/17 17:58, Ashutosh Bapat wrote:
> On Tue, May 16, 2017 at 9:01 PM, Robert Haas  wrote:
>> On Tue, May 16, 2017 at 8:57 AM, Jeevan Ladhe
>>  wrote:
>>> I have fixed the crash in attached patch.
>>> Also the patch needed bit of adjustments due to recent commit.
>>> I have re-based the patch on latest commit.
>>
>> +    bool        has_default;    /* Is there a default partition?
>> +                                 * Currently false for a range
>> +                                 * partitioned table */
>> +    int         default_index;  /* Index of the default list
>> +                                 * partition. -1 for range
>> +                                 * partitioned tables */
>>
> 
> We have has_null and null_index for list partitioning, where null_index ==
> -1 exactly when has_null is false. Maybe Rahila and/or Jeevan just copied
> that style. Probably we should change that as well?

Probably a good idea.

Thanks,
Amit





Re: [HACKERS] Increasing parallel workers at runtime

2017-05-17 Thread Amit Kapila
On Wed, May 17, 2017 at 5:45 AM, Haribabu Kommi
 wrote:
>
>
> On Wed, May 17, 2017 at 2:35 AM, Robert Haas  wrote:
>>
>
>
> Ok. In this approach, we need to share some of the GatherState structure
> members in shared memory so that they can be accessed by the other
> background process, or we need some better logic to update these details
> whenever a new worker gets allotted.
>

I don't think you need to share that stuff, as we initialize the
dsm/queues based on the planned workers.  One thing you might want to
consider is whether this new background worker can detect that the query
has already finished before launching workers.  Yet another way to look at
this problem is to have alternative non-parallel plans corresponding to
parallel plans.  As of now, we only have one plan during execution of a
parallel plan, and if we don't have enough workers, we have no choice but
to proceed with that plan; however, if we have some alternative plan which
is non-parallel, it might be better to use it if we can't allocate enough
workers for the query.  IIRC, this was discussed previously during the
development of the Parallel Sequential Scan patch, and I think some other
databases use a similar technique for this problem.




-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Increasing parallel workers at runtime

2017-05-17 Thread Amit Kapila
On Tue, May 16, 2017 at 2:14 PM, Ashutosh Bapat
 wrote:
> On Mon, May 15, 2017 at 9:23 PM, Robert Haas  wrote:
>
> Also, looking at the patch, it doesn't look like it takes enough care
> to build the execution state of a new worker so that it can participate
> in a running query. I may be wrong, but aren't the execution state
> initialization routines written with the assumption that all the workers
> start simultaneously?
>

There is no such assumption; workers started later can also join the
execution of the query.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-17 Thread Jeevan Ladhe
On Wed, May 17, 2017 at 2:28 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Tue, May 16, 2017 at 9:01 PM, Robert Haas 
> wrote:
> > On Tue, May 16, 2017 at 8:57 AM, Jeevan Ladhe
> >  wrote:
> >> I have fixed the crash in attached patch.
> >> Also the patch needed bit of adjustments due to recent commit.
> >> I have re-based the patch on latest commit.
> >
> > +    bool        has_default;    /* Is there a default partition?
> > +                                 * Currently false for a range
> > +                                 * partitioned table */
> > +    int         default_index;  /* Index of the default list
> > +                                 * partition. -1 for range
> > +                                 * partitioned tables */
> >
>
> We have has_null and null_index for list partitioning, where null_index ==
> -1 exactly when has_null is false. Maybe Rahila and/or Jeevan just copied
> that style. Probably we should change that as well?
>
>
I agree with Ashutosh.
I had given a similar comment on an earlier version of the patch[1], and
Rahila replied with the above reasoning, hence did not change the logic she
introduced.

Probably it's a good idea to have a separate patch, in a separate thread,
that removes the has_null logic.

[1]
https://www.postgresql.org/message-id/CAOgcT0NUUQXWRXmeVKkYTDQvWoKLYRMiPbc89ua6i_gG8FPDmA%40mail.gmail.com

Regards,
Jeevan Ladhe.


[HACKERS] fix hard-coded index in make_partition_op_expr

2017-05-17 Thread Jeevan Ladhe
Hi,

While browsing through the partitioning code, I noticed that a recent
commit, f8bffe9e6d700fd34759a92e47930ce9ba7dcbd5, which fixes multi-column
range partitioning constraints, introduced a function make_partition_op_expr
that takes keynum as an input parameter to identify the index of the
partition key.  In the case of range partitioning we can have multiple
partition keys, but for list partitioning we have only one.  Considering
that, I think the following code does not cause any side-effect logically
(and may be an oversight while moving the code from get_qual_for_list to
this function):

saopexpr->inputcollid = key->partcollation[0];
saopexpr->args = list_make2(arg1, arg2);

But as we have keynum now, shouldn't we be using it to index
key->partcollation instead of a hard-coded '0'?
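
Concretely, the proposed change is a one-liner (sketched against the code
quoted above):

    /* index the collation by the key column being handled, not column 0 */
    saopexpr->inputcollid = key->partcollation[keynum];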

I looked around the list partitioning code, and a hard-coded '0' is used in
all those places, but that code is either list-specific or does not have the
partition key index available.

The attached patch makes this small change in make_partition_op_expr.
Another way is to have an Assert for the PARTITION_STRATEGY_LIST case:
Assert(keynum == 0);

PFA.

Regards,
Jeevan Ladhe


fix_indexing_make_partition_op_expr_v1.patch
Description: Binary data



Re: [HACKERS] UPDATE of partition key

2017-05-17 Thread Amit Kapila
On Wed, May 17, 2017 at 12:06 PM, Dilip Kumar  wrote:
> On Fri, May 12, 2017 at 4:17 PM, Amit Khandekar  
> wrote:
>> Option 3
>> 
>>
>> BR, AR delete triggers on source partition
>> BR, AR insert triggers on destination partition.
>>
>> Rationale :
>> Since the update is converted to delete+insert, just skip the update
>> triggers completely.
>
> +1 to option3
>
..
> Earlier I thought that option 1 was better, but later I realized that it
> can complicate the situation, as we fire first the BR update and then the
> BR delete, which can change the row multiple times, and defining such
> behaviour can be complicated.
>

If we have to go by this theory, then the option you have preferred will
still execute BR triggers for both the delete and the insert, so the input
row can still be changed twice.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] UPDATE of partition key

2017-05-17 Thread Amit Kapila
On Mon, May 15, 2017 at 5:28 PM, Robert Haas  wrote:
> On Fri, May 12, 2017 at 3:07 AM, Amit Kapila  wrote:
>> I agree with you that it might not be straightforward to make it work,
>> but now that earliest it can go is v11, do we want to try doing
>> something other than just documenting it.  What I could read from this
>> e-mail thread is that you are intending towards just documenting it
>> for the first cut of this feature. However, both Greg and Simon are of
>> opinion that we should do something about this and even patch Author
>> (Amit Khandekar) has shown some inclination to do something about this
>> point (return error to the user in some way), so I think we can't
>> ignore this point.
>>
>> I think now that we have some more time, it is better to try something
>> based on a couple of ideas floating in this thread to address this
>> point and see if we can come up with something doable without a big
>> architecture change.
>>
>> What is your take on this point now?
>
> I still don't think it's worth spending a bit on this, especially not
> with WARM probably gobbling up multiple bits.  Reclaiming the bits
> seems like a good idea, but spending one on this still seems to me
> like it's probably not the best use of our increasingly-limited supply
> of infomask bits.
>

I think we can do this even without using an additional infomask bit.
As suggested by Greg upthread, we can set InvalidBlockId in the ctid to
indicate such an update.
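
A rough sketch of that idea (not a worked-out design; macro usage as in
storage/block.h and storage/itemptr.h):

    HeapTupleHeader htup;   /* the old tuple's header (hypothetical context) */

    /* writer side: stamp the old version as "moved to another partition" */
    BlockIdSet(&htup->t_ctid.ip_blkid, InvalidBlockNumber);

    /* reader side: detect it when following the update chain */
    if (!BlockNumberIsValid(BlockIdGetBlockNumber(&htup->t_ctid.ip_blkid)))
        elog(ERROR, "tuple was moved to another partition");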

>  Now, Simon and Greg may still feel otherwise, of
> course.
>
> I could get behind providing an option to turn this behavior on and
> off at the level of the partitioned table.
>

Sure, that sounds like a viable option, and we can set the default value to
false.  However, it might be better if we can detect this internally without
big changes.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] UPDATE of partition key

2017-05-17 Thread Dilip Kumar
On Wed, May 17, 2017 at 3:15 PM, Amit Kapila  wrote:
>> Earlier I thought that option 1 was better, but later I realized that it
>> can complicate the situation, as we fire first the BR update and then the
>> BR delete, which can change the row multiple times, and defining such
>> behaviour can be complicated.
>>
>
> If we have to go by this theory, then the option you have preferred will
> still execute BR triggers for both the delete and the insert, so the input
> row can still be changed twice.

Yeah, right; as per my theory above, option 3 has the same problem.

But after putting some more thought into it, I realised that a row can be
changed only by a "Before Update" or "Before Insert" trigger.  Correct me
if I am assuming something wrong.

So now, again, option 3 makes more sense.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [POC] hash partitioning

2017-05-17 Thread amul sul
On Wed, May 17, 2017 at 11:11 AM, Ashutosh Bapat
 wrote:
> On Wed, May 17, 2017 at 12:04 AM, amul sul  wrote:
>> On Tue, May 16, 2017 at 10:00 PM, Dilip Kumar  wrote:
>>> On Tue, May 16, 2017 at 4:22 PM, amul sul  wrote:
>>>> v6 patch has a bug in partition oid mapping and indexing, fixed in the
>>>> attached version.
>>>>
>>>> Now partition oids will be arranged in the ascending order of hash
>>>> partition bound  (i.e. modulus and remainder sorting order)
>>>
>>> Thanks for the updated patch. I have some more comments.
>>>
>>> 
>>> + if (spec->remainder < 0)
>>> + ereport(ERROR,
>>> + (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
>>> +  errmsg("hash partition remainder must be less than modulus")));
>>>
>>> I think this error message is not correct, you might want to change it
>>> to "hash partition remainder must be non-negative integer"
>>>
>>
>> Fixed in the attached version;  used "hash partition remainder must be
>> greater than or equal to 0" instead.
>
> I would suggest "non-zero positive", since that's what we are using in
> the documentation.
>

Understood, fixed in the attached version.

>>
>>> ---
>>>
>>> + The table is partitioned by specifying remainder and modulus for 
>>> each
>>> + partition. Each partition holds rows for which the hash value of
>>>
>>> Wouldn't it be better to say "modulus and remainder" instead of
>>> "remainder and modulus" then it will be consistent?
>>>
>>
>> You are correct, fixed in the attached version.
>>
>>> ---
>>> +   An UPDATE that causes a row to move from one partition 
>>> to
>>> +   another fails, because
>>>
>>> fails, because -> fails because
>>>
>>
>> This hunk no longer exists in the attached patch; it was mistakenly
>> copied, sorry about that.
>>
>>> ---
>>>
>>> Wouldn't it be a good idea to document how to increase the number of
>>> hash partitions, I think we can document it somewhere with an example,
>>> something like Robert explained upthread?
>>>
>>> create table foo (a integer, b text) partition by hash (a);
>>> create table foo1 partition of foo with (modulus 2, remainder 0);
>>> create table foo2 partition of foo with (modulus 2, remainder 1);
>>>
>>> You can detach foo1, create two new partitions with modulus 4 and
>>> remainders 0 and 2, and move the data over from the old partition
>>>
>>> I think it would be good information for a user to have? Or is it
>>> already documented and I missed it?
>>>
>
> This is already part of the documentation contained in the patch.
>
> Here are some more comments
> @@ -3296,6 +3311,14 @@ ALTER TABLE measurement ATTACH PARTITION
> measurement_y2008m02
> not the partitioned table.
>
>   
> +
> + 
> +  
> +   An UPDATE that causes a row to move from one partition to
> +   another fails, because the new value of the row fails to satisfy the
> +   implicit partition constraint of the original partition.
> +  
> + 
>  
>  
>  
> The description in this chunk is applicable to all the kinds of partitioning.
> Why should it be part of a patch implementing hash partitioning?
>

This was already addressed in the previous patch (v8).

> +Declarative partitioning only supports hash, list and range
> +partitioning, whereas table inheritance allows data to be
> +divided in a manner of the user's choosing.  (Note, however,
> +that if constraint exclusion is unable to prune partitions
> +effectively, query performance will be very poor.)
> Looks like the line width is less than 80 characters.
>

Fixed in the attached version.

> In partition_bounds_equal(), please add comments explaining why it is safe
> to check just the indexes. Maybe we should add code under an assertion to
> make sure that the datums are equal as well.

Added assert in the attached version.

> The comment could be something
> like, "If two partitioned tables have different greatest moduli, their
> partition schemes don't match. If they have same greatest moduli, and
> all remainders have different indexes, they all have same modulus
> specified and the partitions are ordered by remainders, thus indexes
> array will be an identity i.e. index[i] = i. If the partition
> corresponding to a given remainder exists, it will have same index
> entry for both partitioned tables or if it's missing it will be -1.
> Thus if indexes array matches, corresponding datums array matches. If
> there are multiple remainders corresponding to a given partition,
> their partitions are ordered by the lowest of the remainders, thus if
> indexes array matches, both of the tables have same indexes arrays, in
> both the tables remainders corresponding to multiple partitions all
> have same indexes and thus same modulus. Thus again if the indexes are
> same, datums are same.".
>

Thanks, added with minor modification.

> In the same function
> if (key->strategy == PARTITION_STRATEGY_HASH)
> {
>     int         greatest_modulus;
>

[HACKERS] remove unnecessary flag has_null from PartitionBoundInfoData

2017-05-17 Thread Jeevan Ladhe
Hi,

As discussed in the default partition thread[1], here is a patch to remove
has_null from the PartitionBoundInfoData structure.
Basically, the has_null flag is not needed; null_index can be checked to see
whether the current bound includes a null value or not.

For simplicity of future use, in the attached patch I have introduced a
macro that returns TRUE if the given bound has a null.
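
For reference, a plausible shape for such a macro, given the null_index
convention described above (the name here is illustrative; the patch may
spell it differently):

    /* true if this list-bound set includes a NULL-accepting partition */
    #define partition_bound_accepts_nulls(bi)  ((bi)->null_index != -1)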

PFA.

[1].
https://www.postgresql.org/message-id/CAFjFpRcvD-jf6z6-5mjx4tm-8NSj8Wnh%2B6%3DL6OB9Qy2yKX0HCg%40mail.gmail.com

Regards,
Jeevan Ladhe


remove_has_null_PartitionBoundInfoData_v1.patch
Description: Binary data



Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

2017-05-17 Thread Amit Kapila
On Tue, May 16, 2017 at 9:39 PM, Robert Haas  wrote:
> On Sun, May 7, 2017 at 11:54 AM, Robert Haas  wrote:
>> I'm having second thoughts based on some more experimentation I did
>> this morning.  I'll update again once I've had a bit more time to poke
>> at it.
>
> So the issue that I noticed here is that this problem really isn't
> confined to abort processing.  If we ROLLBACK TO SAVEPOINT or ABORT
> TRANSACTION on an open connection, we really do not know what the
> state of that connection is until we get an acknowledgement that the
> command completed successfully.  The patch handles that.  However, the
> same is true if we're sending a SAVEPOINT or RELEASE SAVEPOINT
> command, and the patch that I posted doesn't do anything about those
> cases.  I think it would be best to fix all transaction control
> commands in a symmetric manner.
>

+1.  Why not have similar behavior for any other statements executed in
this module by do_sql_command?
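
(For context, do_sql_command in contrib/postgres_fdw/connection.c has
roughly the following shape; paraphrased here, so treat it as a sketch
rather than the exact source:)

    static void
    do_sql_command(PGconn *conn, const char *sql)
    {
        PGresult   *res;

        /* send the command and wait synchronously for its result */
        res = PQexec(conn, sql);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            pgfdw_report_error(ERROR, res, conn, true, sql);
        PQclear(res);
    }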

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table

2017-05-17 Thread Masahiko Sawada
On Mon, May 15, 2017 at 8:02 PM, Kyotaro HORIGUCHI
 wrote:
> Hello,
>
> At Fri, 12 May 2017 17:24:07 +0900, Masahiko Sawada  
> wrote in 
>> On Fri, May 12, 2017 at 11:24 AM, Masahiko Sawada  
>> wrote:
>> > On Thu, May 11, 2017 at 6:16 PM, Petr Jelinek
>> >  wrote:
>> >> On 11/05/17 10:10, Masahiko Sawada wrote:
>> >>> On Thu, May 11, 2017 at 4:06 PM, Michael Paquier
>> >>>  wrote:
>>  On Wed, May 10, 2017 at 11:57 AM, Masahiko Sawada 
>>   wrote:
>> > Barring any objections, I'll add these two issues to open item.
>> 
>>  It seems to me that those open items have not been added yet to the
>>  list. If I am following correctly, they could be defined as follows:
>>  - Dropping subscription may stuck if done during tablesync.
>>  -- Analyze deadlock issues with DROP SUBSCRIPTION and apply worker 
>>  process.
>> >>
>> >> I think the solution to this is to reintroduce the LWLock that was
>> >> removed and replaced with the exclusive lock on catalog [1]. I am afraid
>> >> that correct way of handling this is to do both LWLock and catalog lock
>> >> (first LWLock under which we kill the workers and then catalog lock so
>> >> that something that prevents launcher from restarting them is held till
>> >> the end of transaction).
>> >
>> > I agree to reintroduce the LWLock, and to stop the logical rep worker
>> > first and then modify the catalog. That way we can reduce the catalog
>> > lock level (maybe to RowExclusiveLock) so that the apply worker can see
>> > it. Also, I think we need to do more to avoid holding the LWLock until
>> > end of transaction, because holding an LWLock that long is not a good
>> > idea and could cause deadlocks. For example, we can commit the
>> > transaction in DropSubscription after cleaning up the pg_subscription
>> > record and all its dependencies, and then start a new transaction for
>> > the remaining work. Of course, we would also need to disallow DROP
>> > SUBSCRIPTION being executed in a user transaction then.
>>
>> Attached two draft patches to solve these issues.
>>
>> Attached 0001 patch reintroduces LogicalRepLauncherLock and makes DROP
>> SUBSCRIPTION keep holding it until commit. To prevent the possibility of
>> deadlock, I disallowed DROP SUBSCRIPTION being called in a transaction
>> block. But there might be a more sensible solution for this, so please
>> give me feedback.
>
> +* Protect against launcher restarting the worker. This lock will
> +* be released at commit.
>
> This is wrong. COMMIT doesn't release left-over LWLocks, only ABORT does
> (precisely, it seems intended to fire on ERRORs). So with this patch, the
> second DROP SUBSCRIPTION is stuck on the LWLock acquired the first time.
> And as Petr said, an LWLock held for such a duration seems bad.

Oh, I see.  Thank you for pointing that out.

>
> The cause seems to be that workers ignore SIGTERM under certain
> conditions. One of the choke points is GetSubscription; the other
> is get_subscription_list. I think we can handle both cases
> without LWLocks.
>
> The attached patch does that.
>
> - heap_close + UnlockRelationOid in get_subscription_list() is
>   equivalent to a single heap_close or relation_close, but I kept the
>   pair for apparent symmetry.
>
> - 0.5 seconds for the sleep in ApplyWorkerMain is quite
>   arbitrary. NAPTIME_PER_CYCLE * 1000 could be used instead.
>
> - NULL MySubscription without SIGTERM might not need to be an
>   ERROR.
>
> Any more thoughts?

I think the above changes can solve this issue, but it seems to me that
DROP SUBSCRIPTION holding AccessExclusiveLock on pg_subscription until
commit could lead to another deadlock problem in the future. So I'd like
to contrive a way to reduce the lock level somehow, if possible. For
example, if we change the apply launcher so that it gets the subscription
list only when pg_subscription is invalidated, the apply launcher won't
try to launch an apply worker that is being stopped. We invalidate
pg_subscription at commit of DROP SUBSCRIPTION, and the apply launcher
can then get a new subscription list which doesn't include the entry we
removed. That way we can reduce the lock level to ShareUpdateExclusiveLock
and solve this issue.
Also, with your patch we need to change DROP SUBSCRIPTION as well, to
resolve another case I encountered, where DROP SUBSCRIPTION waits for the
apply worker while holding a tuple lock on pg_subscription_rel, and the
apply worker waits for the same tuple on pg_subscription_rel in
SetSubscriptionRelState().

>
>
> FYI, I reproduced the situation by the following steps. This
> effectively reproduced the situation without delay insertion for
> me.
>
> # Creating 5 tables with 10 rows on the publisher
> create table t1 (a int);
> ...
> create table t5 (a int);
> insert into t1 (select * from generate_series(0, 9) a);
> ...
> insert into t5 (select * from generate_series(0, 9) a);
> create publication p1 for table t1, t2, t3, t4, t5;
>
>
> # Subscribe them, wait 1 sec, then unsubscribe.
> create table t1 (a 

Re: [HACKERS] UPDATE of partition key

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 6:29 AM, Amit Kapila  wrote:
> I think we can do this even without using an additional infomask bit.
> As suggested by Greg upthread, we can set InvalidBlockId in the ctid to
> indicate such an update.

Hmm.  How would that work?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Increasing parallel workers at runtime

2017-05-17 Thread Robert Haas
On Tue, May 16, 2017 at 8:15 PM, Haribabu Kommi
 wrote:
> Ok. In this approach, we need to share some of the GatherState structure
> members in shared memory so that they can be accessed by the other
> background process, or we need some better logic to update these details
> whenever a new worker gets allotted.

What I'm imagining is a shared memory array with up to, say, 32 slots
(or, say, max_worker_processes slots).  Each slot stores the PID of
the leader, the number of workers expected, and the number of workers
actually obtained.  The leader tries to claim a slot for its details
(just giving up if there are none available) and clears the slot again
at the end of the query (if it got one, and even in case of error).
The whole array is protected by a new LWLock.
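
For concreteness, a hypothetical sketch of that array (all names and the
fixed slot count are invented here, not taken from a patch):

    #define PARALLEL_RETRY_SLOTS 32   /* or size by max_worker_processes */

    typedef struct ParallelRetrySlot
    {
        pid_t       leader_pid;        /* PID of the leader; 0 if free */
        int         nworkers_planned;  /* workers the plan asked for */
        int         nworkers_launched; /* workers actually obtained */
    } ParallelRetrySlot;

    /* lives in shared memory; all access is under one new LWLock */
    typedef struct ParallelRetryState
    {
        ParallelRetrySlot slots[PARALLEL_RETRY_SLOTS];
    } ParallelRetryState;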

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] UPDATE of partition key

2017-05-17 Thread Rushabh Lathia
On Wed, May 17, 2017 at 12:06 PM, Dilip Kumar  wrote:

> On Fri, May 12, 2017 at 4:17 PM, Amit Khandekar 
> wrote:
> > Option 3
> > 
> >
> > BR, AR delete triggers on source partition
> > BR, AR insert triggers on destination partition.
> >
> > Rationale :
> > Since the update is converted to delete+insert, just skip the update
> > triggers completely.
>
> +1 to option3
> Generally, BR triggers are used for updating the ROW value and AR
> triggers to VALIDATE the row or to modify some other tables.  So it
> seems that we can fire the triggers according to the actual operation
> happening at the partition level.
>
> For the source partition, it's only a delete operation (no update
> happened), so we fire delete triggers; and for the destination, only
> insert operations, so we fire only insert triggers.  That will keep
> things simple.  And it will also be in sync with the actual
> partition-level delete/insert operations.
>
> We may argue that a user might have declared only update triggers and,
> as he has executed an update operation, he may expect those triggers to
> get fired.  But I think this behaviour can be documented, with the
> reasoning that if the user is updating the partition key then he must be
> ready with the delete/insert triggers also; he cannot rely only upon
> update-level triggers.
>
>
Right, that is my concern too: the user might have declared only update
triggers, and when the user executes an UPDATE he expects them to get
called - but with option 3 that does not happen.

In terms of consistency, option 1 looks better.  It does the same as what
has been implemented for UPSERT - so the user might already be aware of
the trigger behaviour.  Plus, if we document the behaviour, then it sounds
correct -

- The original command was UPDATE, so BR update
- Later we found that it is a ROW movement - so BR delete followed by AR delete
- Then insert into the new partition - so BR insert followed by AR insert.

But again, I am not quite sure how good it is to compare the partition
behaviour with UPSERT.




> Earlier I thought that option 1 was better, but later I realized that it
> can complicate the situation, as we fire first the BR update and then the
> BR delete, which can change the row multiple times, and defining such
> behaviour can be complicated.
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>
>



-- 
Rushabh Lathia


Re: [HACKERS] [COMMITTERS] pgsql: Tag refs/tags/REL_10_BETA1 was created

2017-05-17 Thread Andrew Dunstan


On 05/16/2017 10:37 PM, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 5/16/17 18:14, pg...@postgresql.org wrote:
>>> Tag refs/tags/REL_10_BETA1 was created.
>> Was this change in naming pattern intentional?
> Yes, it was.  Andrew Dunstan suggested[1] during the
> two-part-version-number discussion that we should start including a "_"
> after REL in tag and branch names for v10 and later, so that those names
> would sort correctly compared to the tag/branch names for earlier branches
> (at least when using C locale).  I believe his main concern was some logic
> in the buildfarm, but it seems like a good idea in general.
>
> When we get to v100, we'll need some other hack to make it work ...
> but I plan to be safely dead by then.
>


Me too. Since posterity will be deprived of both of us, let's note that
the same hack will work; we'll just need two underscores.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] [COMMITTERS] pgsql: Tag refs/tags/REL_10_BETA1 was created

2017-05-17 Thread Peter Eisentraut
On 5/16/17 22:37, Tom Lane wrote:
> BTW, I now remember having wondered[2] if we should make any other changes
> in version-number formatting while we're at it, like maybe "10beta1"
> should be "10.beta1".

That's not a naming format I've ever seen.

I think the current format is fine.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Tag refs/tags/REL_10_BETA1 was created

2017-05-17 Thread Magnus Hagander
On Wed, May 17, 2017 at 2:41 PM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 5/16/17 22:37, Tom Lane wrote:
> > BTW, I now remember having wondered[2] if we should make any other
> changes
> > in version-number formatting while we're at it, like maybe "10beta1"
> > should be "10.beta1".
>
> That's not a naming format I've ever seen.
>
> I think the current format is fine.
>
>
+1. I have also never seen that one, and think the current one is good.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] [POC] hash partitioning

2017-05-17 Thread Ashutosh Bapat
On Wed, May 17, 2017 at 2:07 PM, amul sul  wrote:

>
>> In partition_bounds_equal(), please add comments explaining why it is
>> safe to check just the indexes. Maybe we should add code under an
>> assertion to make sure that the datums are equal as well.
>
> Added assert in the attached version.
>
>> The comment could be something
>> like, "If two partitioned tables have different greatest moduli, their
>> partition schemes don't match. If they have same greatest moduli, and
>> all remainders have different indexes, they all have same modulus
>> specified and the partitions are ordered by remainders, thus indexes
>> array will be an identity i.e. index[i] = i. If the partition
>> corresponding to a given remainder exists, it will have same index
>> entry for both partitioned tables or if it's missing it will be -1.
>> Thus if indexes array matches, corresponding datums array matches. If
>> there are multiple remainders corresponding to a given partition,
>> their partitions are ordered by the lowest of the remainders, thus if
>> indexes array matches, both of the tables have same indexes arrays, in
>> both the tables remainders corresponding to multiple partitions all
>> have same indexes and thus same modulus. Thus again if the indexes are
>> same, datums are same.".
>>
>
> Thanks, added with minor modification.

I have reworded this slightly better; see the attached patch, a diff against 0002.

>
>> In the same function
>> if (key->strategy == PARTITION_STRATEGY_HASH)
>> {
>>     int         greatest_modulus;
>>
>>     /*
>>      * Compare greatest modulus of hash partition bound which
>>      * is the last element of datums array.
>>      */
>>     if (b1->datums[b1->ndatums - 1][0] != b2->datums[b2->ndatums - 1][0])
>>         return false;
>>
>>     /* Compare indexes */
>>     greatest_modulus = DatumGetInt32(b1->datums[b1->ndatums - 1][0]);
>>     for (i = 0; i < greatest_modulus; i++)
>>         if (b1->indexes[i] != b2->indexes[i])
>>             return false;
>> }
>> if we return true from where this block ends, we will save one
>> indentation level for the rest of the code and also, FWIW, the extra
>> diffs in this patch caused by this indentation change.
>>
>
I still believe that having this code in the IF-ELSE block will be better
for the long term, rather than cluttering the code now just to avoid a
diff that is unpleasant for the moment.

Ok, I will leave it to the committer to judge.


Comments on the tests
+#ifdef USE_ASSERT_CHECKING
+    {
+        /*
+         * Hash partition bound stores modulus and remainder at
+         * b1->datums[i][0] and b1->datums[i][1] position respectively.
+         */
+        for (i = 0; i < b1->ndatums; i++)
+            Assert((b1->datums[i][0] == b2->datums[i][0] &&
+                    b1->datums[i][1] == b2->datums[i][1]));
+    }
+#endif
Why do we need extra {} here?

Comments on testcases
+CREATE TABLE hpart_1 PARTITION OF hash_parted FOR VALUES WITH
(modulus 8, remainder 0);
+CREATE TABLE fail_part (LIKE hpart_1 INCLUDING CONSTRAINTS);
+ALTER TABLE hash_parted ATTACH PARTITION fail_part FOR VALUES WITH
(modulus 4, remainder 0);
Probably you should also test the other way round, i.e. create a (modulus 4,
remainder 0) partition and then try to add partitions with (modulus 8,
remainder 4) and (modulus 8, remainder 0). That should fail.

Why create two tables, hash_parted and hash_parted2? You should be able to
test with only a single table.

+INSERT INTO hpart_2 VALUES (3, 'a');
+DELETE FROM hpart_2;
+INSERT INTO hpart_5_a (a, b) VALUES (6, 'a');
This is slightly tricky. On different platforms a row may map to different
partitions depending upon how the values are hashed, so this test may not be
portable across all platforms. Probably you should add such testcases with a
custom hash operator class that is an identity function, as suggested by
Robert. This also applies to the tests in insert.sql and update.sql for a
partitioned table without a custom opclass.

+-- delete the faulting row and also add a constraint to skip the scan
+ALTER TABLE hpart_5 ADD CONSTRAINT hcheck_a CHECK (a IN (5)), ALTER a
SET NOT NULL;
The constraint is not the same as the implicit constraint added for that
partition. I am not sure whether it's really going to avoid the scan. Did
you verify that? If yes, then how?

+ALTER TABLE hash_parted2 ATTACH PARTITION fail_part FOR VALUES WITH
(modulus 3, remainder 2);
+ERROR:  every hash partition modulus must be a factor of the next
larger modulus
We should add this test with at least two partitions in there so that we
can check the lower and upper modulus. Also, testing with some of the
interesting bounds discussed earlier in this thread, e.g. adding modulus 15
when 5, 10, and 60 exist, would be better than testing with 3, 4, and 8.

+ERROR:  cannot use collation for hash partition key column "a"
This seems to indicate that we cannot specify a collation for a hash
partition key column, which isn't true.

[HACKERS] Re: [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Alvaro Herrera
Tom Lane wrote:

> src/bin/pg_basebackup/pg_basebackup.c  | 24 +-
> src/bin/pg_waldump/pg_waldump.c| 18 ++---

There are some changes here that should be reverted; for instance:

-   printf(_("  -c, --checkpoint=fast|spread\n"
-" set fast or spread checkpointing\n"));
+   printf(_("  -c, --checkpoint=fast|spread\n"));
+   printf(_(" set fast or spread checkpointing\n"));

From the translator's point of view the patched version doesn't make
sense because they are two separate strings.  In the original, it's a
single translatable string.  Particularly in pg_waldump's -p, where a
phrase is now cut in the middle.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Alvaro Herrera  writes:
> There are some changes here that should be reverted; for instance:

> -   printf(_("  -c, --checkpoint=fast|spread\n"
> -" set fast or spread checkpointing\n"));
> +   printf(_("  -c, --checkpoint=fast|spread\n"));
> +   printf(_(" set fast or spread checkpointing\n"));

> From the translator's point of view the patched version doesn't make
> sense because they are two separate strings.  In the original, it's a
> single translatable string.  Particularly in pg_waldump's -p, where a
> phrase is now cut in the middle.

What I was concerned about was that pgindent will reindent the second
line so that it's impossible to tell whether the spacing is correct.
That might not matter to translators but it will be a problem for
source-level maintenance.

Maybe we should rethink the whole idea of breaking these entries across
lines, and just accept that the commentary doesn't line up with other
lines:

   printf(_("  -c, --checkpoint=fast|spread  set fast or spread checkpointing\n"));

Thoughts?

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Alvaro Herrera  writes:
>> ... Particularly in pg_waldump's -p, where a
>> phrase is now cut in the middle.

BTW, I would say that the problem with -p is that somebody failed to
understand the difference between --help and a man page.  That entry
should be

  -p, --path=PATH        directory in which to find log segment files

full stop.  I'm not sure that the other multi-line entries in
pg_waldump's --help have an excuse to live, either.

regards, tom lane




Re: [HACKERS] Small improvement to compactify_tuples

2017-05-17 Thread Sokolov Yura

Alvaro Herrera wrote on 2017-05-15 20:13:

As I understand, these patches are logically separate, so putting them
together in a single file isn't such a great idea.  If you don't edit
the patches further, then you're all set because we already have the
previously archived patches.  Next commitfest starts in a few months
yet, and if you feel the need to submit corrected versions in the
meantime, please do submit in separate files.  (Some would even argue
that each should be its own thread, but I don't think that's 
necessary.)


Thank you for explanation.

I'm attaching a new version of the first patch with a minor improvement:
- I added detection of the case when all buckets are trivial
  (i.e. 0 or 1 elements). In that case there is no need to sort the
  buckets at all.

--
Sokolov Yura aka funny.falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

From 4fafc870690b7de5aeab7e583780546e8170f6d0 Mon Sep 17 00:00:00 2001
From: Sokolov Yura 
Date: Mon, 15 May 2017 14:23:39 +0300
Subject: [PATCH 1/2] Improve compactify_tuples

Items passed to compactify_tuples are almost sorted, so a call to the
general qsort incurs unnecessary overhead from function calls (swapfunc,
med, itemoffcompare).

This patch implements a bucket sort:
- one pass of stable prefix sort on the high 8 bits of the offset
- and insertion sort for buckets larger than 1 element

Also, for smaller arrays, shell sort is used.

Insertion and shell sort are implemented using macros.

This approach saves about 3% of CPU in a degenerate case
(highly intensive HOT random updates on an unlogged table with
synchronous_commit=off), and speeds up WAL replay (as was
found by Heikki Linnakangas).

A similar approach was implemented by Heikki Linnakangas some time ago,
with several differences.
---
 src/backend/storage/page/bufpage.c | 86 --
 src/include/c.h| 59 ++
 2 files changed, 142 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index fdf045a45b..c1bb11c354 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -434,6 +434,85 @@ itemoffcompare(const void *itemidp1, const void *itemidp2)
 }
 
 /*
+ * Sort an array of itemIdSort's on itemoff, descending.
+ *
+ * It uses Shell sort. The given array is small and itemoffcompare is inlined,
+ * so it is much faster than a call to qsort.
+ */
+static void
+sort_itemIds_small(itemIdSort itemidbase, int nitems)
+{
+	pg_shell_sort(itemIdSortData, itemidbase, nitems, itemoffcompare);
+}
+
+/*
+ * Sort an array of itemIdSort's on itemoff, descending.
+ *
+ * It uses bucket sort:
+ * - single pass of stable prefix sort on high 8 bits
+ * - and insertion sort on buckets larger than 1 element
+ */
+static void
+sort_itemIds(itemIdSort itemidbase, int nitems)
+{
+	itemIdSortData copy[Max(MaxIndexTuplesPerPage, MaxHeapTuplesPerPage)];
+#define NSPLIT 256
+#define PREFDIV (BLCKSZ / NSPLIT)
+	/* two extra elements to emulate offset on previous step */
+	uint16		count[NSPLIT + 2] = {0};
+	int			i,
+max,
+total,
+pos,
+highbits;
+
+	Assert(nitems <= MaxIndexTuplesPerPage);
+	for (i = 0; i < nitems; i++)
+	{
+		highbits = itemidbase[i].itemoff / PREFDIV;
+		count[highbits]++;
+	}
+	/* sort in decreasing order */
+	max = total = count[NSPLIT - 1];
+	for (i = NSPLIT - 2; i >= 0; i--)
+	{
+		max |= count[i];
+		total += count[i];
+		count[i] = total;
+	}
+
+	/*
+	 * count[k+1] is start of bucket k, count[k] is end of bucket k, and
+	 * count[k] - count[k+1] is length of bucket k.
+	 */
+	Assert(count[0] == nitems);
+	for (i = 0; i < nitems; i++)
+	{
+		highbits = itemidbase[i].itemoff / PREFDIV;
+		pos = count[highbits + 1];
+		count[highbits + 1]++;
+		copy[pos] = itemidbase[i];
+	}
+	Assert(count[1] == nitems);
+
+	if (max == 1)
+		goto end;
+	/*
+	 * count[k+2] is start of bucket k, count[k+1] is end of bucket k, and
+	 * count[k+1]-count[k+2] is length of bucket k.
+	 */
+	for (i = NSPLIT; i > 0; i--)
+	{
+		pg_insertion_sort(itemIdSortData,
+		  copy + count[i + 1],
+		  count[i] - count[i + 1],
+		  itemoffcompare);
+	}
+end:
+	memcpy(itemidbase, copy, sizeof(itemIdSortData) * nitems);
+}
+
+/*
  * After removing or marking some line pointers unused, move the tuples to
  * remove the gaps caused by the removed items.
  */
@@ -444,9 +523,10 @@ compactify_tuples(itemIdSort itemidbase, int nitems, Page page)
 	Offset		upper;
 	int			i;
 
-	/* sort itemIdSortData array into decreasing itemoff order */
-	qsort((char *) itemidbase, nitems, sizeof(itemIdSortData),
-		  itemoffcompare);
+	if (nitems > 48)
+		sort_itemIds(itemidbase, nitems);
+	else
+		sort_itemIds_small(itemidbase, nitems);
 
 	upper = phdr->pd_special;
 	for (i = 0; i < nitems; i++)
diff --git a/src/include/c.h b/src/include/c.h
index fba07c651f..837940d5cf 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -962,6 +962,65 @@ typedef NameData *Name;
 #define unlikely(x) ((x) != 0)
 

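[The diff is truncated here in the archive, before the new sort macros
added to c.h.  As a rough illustration only: a macro-generated insertion
sort of the kind the commit message describes could look like the sketch
below.  The actual pg_insertion_sort/pg_shell_sort bodies are not shown
in this message, so this shape is an assumption.]

/*
 * Hypothetical sketch: the comparator is expanded inline at every use
 * site, avoiding qsort's function-pointer calls.
 */
#define pg_insertion_sort(type, base, nitems, cmp) \
	do { \
		int		_i, _j; \
		type	_tmp; \
		for (_i = 1; _i < (nitems); _i++) \
		{ \
			_tmp = (base)[_i]; \
			for (_j = _i; _j > 0 && (cmp)(&(base)[_j - 1], &_tmp) > 0; _j--) \
				(base)[_j] = (base)[_j - 1]; \
			(base)[_j] = _tmp; \
		} \
	} while (0)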
Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-17 Thread Bossart, Nathan
On 5/16/17, 11:21 PM, "Michael Paquier"  wrote:
> On Wed, May 17, 2017 at 7:56 AM, Bossart, Nathan  wrote:
>> I think this issue already exists, as this comment in get_rel_oids(…) seems 
>> to indicate:
>>
>> /*
>>  * Since we don't take a lock here, the relation might be gone, or the
>>  * RangeVar might no longer refer to the OID we look up here.  In the
>>  * former case, VACUUM will do nothing; in the latter case, it will
>>  * process the OID we looked up here, rather than the new one. Neither
>>  * is ideal, but there's little practical alternative, since we're
>>  * going to commit this transaction and begin a new one between now
>>  * and then.
>>  */
>>  relid = RangeVarGetRelid(vacrel, NoLock, false);
>>
>> With the patch applied, I believe this statement still holds true.  So if 
>> the relation disappears before we get to vacuum_rel(…), we will simply skip 
>> it and move on to the next one.  The vacuum_rel(…) code provides a WARNING 
>> in many cases (e.g. the user does not have privileges to VACUUM the table), 
>> but we seem to silently skip the table when it disappears before the call to 
>> vacuum_rel(…).
>
> Yes, that's the bits using try_relation_open() which returns NULL if
> the relation is gone. I don't think that we want VACUUM to be noisy
> about that when running it on a database.

Agreed.

>> If we added a WARNING to the effect of “skipping vacuum of  — 
>> relation no longer exists” for this case, I think what you are suggesting 
>> would be satisfied.
>
> We would do no favor by reporting nothing to the user. Without any
> information, the user triggering a manual vacuum may believe that the
> relation has been vacuumed as it was listed but that would not be the
> case.

Agreed.  Unfortunately, I think this is already possible when you specify a 
table to be vacuumed.

>> However, ANALYZE has a slight caveat.  While analyze_rel(…) silently skips 
>> the relation if it no longer exists like vacuum_rel(…) does, we do not 
>> pre-validate the columns list at all.  So, in an ANALYZE statement with 
>> multiple tables and columns specified, it’ll only fail once we get to the 
>> undefined column.  To fix this, we could add a check for the column lists 
>> near get_rel_oids(…) and adjust do_analyze_rel(…) to emit a WARNING and skip 
>> any columns that vanish in the meantime.
>>
>> Does this seem like a sane approach?
>>
>>   1. Emit WARNING when skipping if relation disappears before we get to it.
>>   2. Early in vacuum(…), check that the specified columns exist.
>
> And issue an ERROR, right?

Correct.  This means that both the relations and columns specified would cause 
an ERROR if they do not exist and a WARNING if they disappear before we can 
actually process them.
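For the relation-vanished case, a minimal sketch of the skip-and-warn
behavior might look like this (reusing the existing try_relation_open(…)
call in vacuum_rel(…); the message wording is just illustrative):

onerel = try_relation_open(relid, lmode);
if (onerel == NULL)
{
	ereport(WARNING,
			(errcode(ERRCODE_UNDEFINED_TABLE),
			 errmsg("skipping vacuum of \"%s\" --- relation no longer exists",
					relation->relname)));
	return false;
}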

> +   RelationAndColumns *relinfo = (RelationAndColumns *)
> lfirst(relation);
> +   int per_table_opts = options | relinfo->options;  /*
> VACOPT_ANALYZE may be set per-table */
> +   ListCell *oid;
> I have just noticed this bit in your patch. So you can basically do
> something like that:
> VACUUM (ANALYZE) foo, (FULL) bar;
> Do we really want to go down to this level of control? I would keep
> things a bit more restricted as users may be confused by the different
> lock levels involved by each operation, and make use of the same
> options for all the relations listed. Opinions from others is welcome.

I agree with you here, too.  I stopped short of allowing customers to 
explicitly provide per-table options, so the example you provided wouldn’t work 
here.  This is more applicable for something like the following:

VACUUM (FREEZE, VERBOSE) foo, bar (a);

In this case, the FREEZE and VERBOSE options are used for both tables.  
However, we have a column list specified for ‘bar’, and the ANALYZE option is 
implied when we specify a column list.  So when we process ‘bar’, we need to 
apply the ANALYZE option, but we do not need it for ‘foo’.  For now, that is 
all that this per-table options variable is used for.

Nathan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Pulling up more complicated subqueries

2017-05-17 Thread Heikki Linnakangas

Hi,

I spent some time staring at TPC-DS benchmark's query 6. It contains a 
somewhat complicated subquery, and most of the time spent on that query 
is currently spent on executing the subquery again and again. The 
essence of the query boils down to this:


CREATE TABLE foo (i int4, j int4);
CREATE TABLE bar (i int4, j int4);

INSERT INTO foo SELECT g%10, g FROM generate_series(1, 1) g;
INSERT INTO bar SELECT g%10, g FROM generate_series(1, 1) g;


SELECT * FROM foo
WHERE foo.j >= 1.2 *
  (SELECT avg(bar.j) FROM bar WHERE foo.i = bar.i);


The planner can pull up simpler subqueries, converting them to joins, 
but unfortunately this case is beyond its capabilities. If it was smart 
enough, it could be transformed into something like this:


SELECT * FROM foo
LEFT JOIN (SELECT avg(bar.j) as avg, bar.i FROM bar GROUP BY bar.i) as 
avg_bar ON foo.i = avg_bar.i

WHERE foo.j >= 1.2 * avg_bar.avg


There was some brief discussion on doing this kind of transformation 
back in 2011, at 
https://www.postgresql.org/message-id/4E2F163B.6060105%40gmail.com. I 
think we're in a better position to do that now than back then, thanks 
to all the other planner work that's been done since.


So how do we get there? I've identified a number of smaller steps that, 
when all put together, can handle this, as well as many other queries of 
course.


0. This simple case is already handled:

SELECT * FROM foo
WHERE foo.j IN (select bar.j from bar)

It's turned into a semi-join. The next steps will extend this basic 
capability, to handle more complicated cases.


postgres=# explain SELECT * FROM foo WHERE foo.j IN (select bar.j from bar);
                              QUERY PLAN
---------------------------------------------------------------
 Hash Join  (cost=42.75..93.85 rows=1130 width=8)
   Hash Cond: (foo.j = bar.j)
   ->  Seq Scan on foo  (cost=0.00..32.60 rows=2260 width=8)
   ->  Hash  (cost=40.25..40.25 rows=200 width=4)
 ->  HashAggregate  (cost=38.25..40.25 rows=200 width=4)
   Group Key: bar.j
   ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)
(7 rows)


1. Currently, the IN/EXISTS pullup is only done when the IN/EXISTS is a 
top-level qual. For example, this isn't currently pulled-up:


SELECT * FROM foo
WHERE foo.j IN (select bar.j from bar) OR foo.j < 10;

That's not a straight semi-join, but we could still turn it into a new 
kind of LEFT-SEMI join. A left-semi join is like a left join, in that it 
returns all rows from the left side, and NULLs for any non-matches. And 
like a semi-join, it returns only one matching row from the right-side, 
if there are duplicates. In the qual, replace the SubLink with an IS NOT 
NULL test. So logically, the above would be converted into:


SELECT * FROM foo
/* SEMI- */ LEFT JOIN (select bar.j from bar) ON foo.j = bar.j
WHERE bar.j IS NOT NULL OR foo.j < 10

This transformation can be applied to IN/EXIST type SubPlan references 
anywhere in the query, not only those in the WHERE clause.



2. Using a similar trick, we can transform subqueries that are not of 
the IN/EXISTS kind. (This isn't useful on its own, because this example 
would be handled as an InitPlan, which performs well. But it's a 
stepping stone in explaining this.)


For example:

SELECT * FROM foo
WHERE foo.j > (select avg(bar.j) from bar)

This can be implemented using yet another new join type, a LEFT-UNIQUE 
join. It's like a LEFT JOIN, but it must check that there are no 
duplicates in the right-hand-side, and throw an error if there are 
(ERROR:  more than one row returned by a subquery used as an expression).


SELECT * FROM foo
/* unique-checking */ LEFT JOIN (select avg(bar.j) as min_j from bar) as 
subq ON TRUE

WHERE foo.j > subq.min_j


3. All the previous examples are uncorrelated subqueries, but all these 
transformations can also be performed on correlated subqueries. The 
difference is that the subquery that's pulled up to the range table must 
be marked as LATERAL. For example:


SELECT * FROM foo
WHERE foo.j > (select avg(bar.j) from bar where bar.i = foo.i)

->

SELECT * FROM foo
/* UNIQUE-CHECKING */ LEFT JOIN LATERAL (select avg(bar.j) from bar 
where bar.i = foo.i) AS subq(j) ON true

WHERE foo.j > subq.j


4. The previous transformation isn't very interesting on its own, 
because the only way to implement a LATERAL join is a nested loop join, 
which is essentially the same as executing the subplan the naive way. 
But we can potentially de-correlate it, by pulling up the correlation qual:



SELECT * FROM foo
/* UNIQUE-CHECKING */ LEFT JOIN LATERAL (select bar.j, bar.i from bar 
WHERE bar.i = foo.i) AS subq(j, i) ON true

WHERE foo.j > subq.j

->

SELECT * FROM foo
/* UNIQUE-CHECKING */ LEFT JOIN (select bar.j, bar.i from bar) AS 
subq(j, i) ON subq.i = foo.i

WHERE foo.j > subq.j


This will get inlined into the parent query, getting rid of the subquery 
altogether. But even if that cannot be done, this is potentially much 
better

Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
I wrote:
> BTW, I would say that the problem with -p is that somebody failed to
> understand the difference between --help and a man page.

Concretely, how about the attached?  I don't think this looks any
worse than the current layout.

regards, tom lane

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 866f88a..f2492e4 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -331,22 +331,17 @@ usage(void)
 	printf(_("\nOptions controlling the output:\n"));
 	printf(_("  -D, --pgdata=DIRECTORY receive base backup into directory\n"));
 	printf(_("  -F, --format=p|t   output format (plain (default), tar)\n"));
-	printf(_("  -r, --max-rate=RATEmaximum transfer rate to transfer data directory\n"));
-	printf(_(" (in kB/s, or use suffix \"k\" or \"M\")\n"));
-	printf(_("  -R, --write-recovery-conf\n"));
-	printf(_(" write recovery.conf for replication\n"));
+	printf(_("  -r, --max-rate=RATEmaximum transfer rate (in kB/s)\n"));
+	printf(_("  -R, --write-recovery-conf  write recovery.conf for replication\n"));
 	printf(_("  -S, --slot=SLOTNAMEreplication slot to use\n"));
 	printf(_("  --no-slot  prevent creation of temporary replication slot\n"));
-	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"));
-	printf(_(" relocate tablespace in OLDDIR to NEWDIR\n"));
-	printf(_("  -X, --wal-method=none|fetch|stream\n"));
-	printf(_(" include required WAL files with specified method\n"));
+	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR  relocate tablespace\n"));
+	printf(_("  -X, --wal-method=none|fetch|stream  set method for including WAL files\n"));
 	printf(_("  --waldir=WALDIRlocation for the write-ahead log directory\n"));
 	printf(_("  -z, --gzip compress tar output\n"));
 	printf(_("  -Z, --compress=0-9 compress tar output with given compression level\n"));
 	printf(_("\nGeneral options:\n"));
-	printf(_("  -c, --checkpoint=fast|spread\n"));
-	printf(_(" set fast or spread checkpointing\n"));
+	printf(_("  -c, --checkpoint=fast|spread  set fast or spread checkpointing\n"));
 	printf(_("  -l, --label=LABEL  set backup label\n"));
 	printf(_("  -n, --no-clean do not clean up after errors\n"));
 	printf(_("  -N, --no-sync  do not wait for changes to be written safely to disk\n"));
@@ -358,8 +353,7 @@ usage(void)
 	printf(_("  -d, --dbname=CONNSTR   connection string\n"));
 	printf(_("  -h, --host=HOSTNAMEdatabase server host or socket directory\n"));
 	printf(_("  -p, --port=PORTdatabase server port number\n"));
-	printf(_("  -s, --status-interval=INTERVAL\n"));
-	printf(_(" time between status packets sent to server (in seconds)\n"));
+	printf(_("  -s, --status-interval=SECONDS  set time between status packets\n"));
 	printf(_("  -U, --username=NAMEconnect as specified database user\n"));
 	printf(_("  -w, --no-password  never prompt for password\n"));
 	printf(_("  -W, --password force password prompt (should happen automatically)\n"));
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 56843a5..e56d7e2 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -689,18 +689,14 @@ usage(void)
 	printf(_("  -e, --end=RECPTR   stop reading at WAL location RECPTR\n"));
 	printf(_("  -f, --follow   keep retrying after reaching end of WAL\n"));
 	printf(_("  -n, --limit=N  number of records to display\n"));
-	printf(_("  -p, --path=PATHdirectory in which to find log segment files or a\n"));
-	printf(_(" directory with a ./pg_wal that contains such files\n"));
-	printf(_(" (default: current directory, ./pg_wal, PGDATA/pg_wal)\n"));
+	printf(_("  -p, --path=PATHdirectory in which to find log segment files\n"));
 	printf(_("  -r, --rmgr=RMGRonly show records generated by resource manager RMGR\n"));
 	printf(_(" use --rmgr=list to list valid resource manager names\n"));
 	printf(_("  -s, --start=RECPTR start reading at WAL location RECPTR\n"));
 	printf(_("  -t, --timeline=TLI timeline from which to read log records\n"));
-	printf(_(" (default: 1 or the value used in STARTSEG)\n"));
 	printf(_("  -V, --version  output version information, then exit\n"));
 	printf(_("  -x, --xid=XID  only show records with TransactionId XID\n"));
-	printf(_("  -z, --stats[=record]   show statistics instead of records\n"));
-	printf(_(" (optionally, show per-record statistics)\n"));
+	printf(_("  -z, --stats[=record]   show statistics or per-record statistics, not records\n"));
 	printf(_("  -?, --help show this help, then exit\

Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Peter Eisentraut
On 5/17/17 11:37, Tom Lane wrote:
> I wrote:
>> BTW, I would say that the problem with -p is that somebody failed to
>> understand the difference between --help and a man page.
> 
> Concretely, how about the attached?  I don't think this looks any
> worse than the current layout.

The previous setup has been in place for years and has never been a
problem.  The alternatives are all quite a bit worse.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries

2017-05-17 Thread Remi Colinet
2017-05-13 3:53 GMT+02:00 Robert Haas :

> On Wed, May 10, 2017 at 10:05 PM, Michael Paquier
>  wrote:
> > Regarding the patch, this is way too close to the progress facility
> > already in place. So why don't you extend it for the executor?
>
> I don't think that's a good idea.  The existing progress facility
> advertises a fixed number of counters with a command-type-specific
> interpretation, but for *query* progress reporting, we really need a
> richer model.  A query plan can contain an unbounded number of
> executor nodes and Remi's idea is, quite reasonably, to report
> something about each one.
>

Oracle reports the progress of long queries through the v$session_longops
view, which collects progress reports for any backend that has been
running a SQL query for more than 6 seconds. But it's probably even better
to use a SQL function that accepts, for instance, a specific pid parameter
in order to limit the report to a given backend. Users are typically
focused on one long-running SQL query at a time.

>
> From a design point of view, I think a patch like this has to clear a
> number of hurdles.  It needs to:
>
> 1. Decide what data to expose.  The sample output looks reasonable in
> this case, although the amount of code churn looks really high.
>

I will probably remove:
- the verbose option (PROGRESS VERBOSE PID), which exposes very low-level
details about the tapes used by Sort nodes and by HashJoinTable nodes.
- the JSON and XML output formats, because the command will be replaced by
a SQL function returning a row set.
This should make the code much smaller and avoid any change to the EXPLAIN
code (which has been shared so far).


> 2. Find a way to advertise the data that it exposes.  This patch looks
> like it allocates a half-MB of shared memory per backend and gives up
> if that's not enough.
>

Maybe it needs dynamic allocation of shared memory instead of static
allocation at server start. That would save shared memory and avoid
failures when a report is larger than the preallocated buffer.
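A minimal sketch of that idea with the existing DSM API (dsm_create and
dsm_segment_address are real functions; the sizing policy here is an
assumption for illustration):

#include "postgres.h"
#include "storage/dsm.h"

/*
 * Allocate a segment sized to the report actually produced, instead of
 * a fixed half-MB static buffer per backend.
 */
static dsm_segment *
progress_report_alloc(Size report_size)
{
	dsm_segment *seg = dsm_create(report_size, 0);

	/*
	 * The reporting backend writes its progress tree at
	 * dsm_segment_address(seg) and hands the segment handle over to the
	 * requesting backend.
	 */
	return seg;
}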


>
> 3. Find a way to synchronize the access to the data that it exposes.
> This patch ignores that problem completely, so multiple progress
> report commands running at the same time will behave unpredictably.
>
> So far, I've been using a single lock in shared memory to synchronize the
requests for SQL progress report.

LWLockAcquire(ProgressLock, LW_EXCLUSIVE);
...
SendProcSignal(stmt->pid, PROCSIG_PROGRESS, bid);
...
LWLockRelease(ProgressLock);

Only a few DBAs are expected to use such a feature at the same time. The
lock would need to be released if the requesting backend fails.

I've also realized that access control for progress report needs to be
added.

Thx & regards
Remi



> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Peter Eisentraut
On 5/17/17 10:14, Tom Lane wrote:
> What I was concerned about was that pgindent will reindent the second
> line so that it's impossible to tell whether the spacing is correct.

pgindent moving string continuations to the left is a completely
terrible behavior anyway and we should look into changing that.  Just
look at the mess it makes out of SELECT queries in pg_dump.c.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement in log message of logical replication worker

2017-05-17 Thread Peter Eisentraut
On 5/17/17 05:15, Masahiko Sawada wrote:
> BTW, why should max_replication_slots be set more than 0 even on the
> subscriber side? It's documented but I could not understand reason.

Because that setting also controls replication origin tracking slots.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries

2017-05-17 Thread Remi Colinet
2017-05-13 14:38 GMT+02:00 Amit Kapila :

> On Wed, May 10, 2017 at 10:10 PM, Remi Colinet 
> wrote:
> >
> > Parallel queries can also be monitored. The same mechanism is used to
> monitor
> > child workers with a slight difference: the main worker requests the
> child
> > progression directly in order to dump the whole progress tree in shared
> > memory.
> >
>
> What if there is any error in the worker (like "out of memory") while
> gathering the statistics?  It seems both for workers as well as for
> the main backend it will just error out.  I am not sure if it is a
> good idea to error out the backend or parallel worker as it will just
> end the query execution.


Handling a progress report starts with the creation of a MemoryContext
attached to CurrentMemoryContext; then a small amount of memory (a few KB)
is allocated. Handling the progress report could indeed exhaust memory and
fail the backend request, but in such a situation the backend could just
as well have failed even without any progress request.


> Also, even if it is okay, there doesn't seem
> to be a way by which a parallel worker can communicate the error back
> to master backend, rather it will just exit silently which is not
> right.
>

If a child worker fails, it will not respond to the main backend's
request. The main backend will resume its execution after a 5-second
timeout (a GUC parameter could be added for this), in which case the
report would be only partially filled.

If the main backend fails, the requesting backend will have a response such
as:

test=# PROGRESS 14611;
 PLAN PROGRESS

 
(1 row)

test=#

and the child workers will write their responses to shared memory. These
responses will not be collected by the failed main backend.

Thx & regards
Remi


> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


Re: [HACKERS] [COMMITTERS] pgsql: Tag refs/tags/REL_10_BETA1 was created

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 8:04 AM, Andrew Dunstan
 wrote:
>> When we get to v100, we'll need some other hack to make it work ...
>> but I plan to be safely dead by then.
>
> Me too. Since posterity will be deprived of both of us let's note that
> the same hack will work, we'll just need two underscores.

That cure sounds worse than the disease, but I guess we can leave the
decision to posterity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries

2017-05-17 Thread Remi Colinet
2017-05-16 8:17 GMT+02:00 Michael Paquier :

> On Sat, May 13, 2017 at 10:53 AM, Robert Haas 
> wrote:
> > On Wed, May 10, 2017 at 10:05 PM, Michael Paquier
> >  wrote:
> >> Regarding the patch, this is way too close to the progress facility
> >> already in place. So why don't you extend it for the executor?
> >
> > I don't think that's a good idea.  The existing progress facility
> > advertises a fixed number of counters with a command-type-specific
> > interpretation, but for *query* progress reporting, we really need a
> > richer model.  A query plan can contain an unbounded number of
> > executor nodes and Remi's idea is, quite reasonably, to report
> > something about each one.
>
> I see what you're getting at. As st_progress_param is upper-bounded,
> what should actually be used as the report structure is a tree, with
> one planner node attached to each node of the report tree. The
> reporting facility would then need to make the reported data (in this
> case a set of planner nodes) persistent, and also persist the changes
> on the tree reported to the user. Then when a backend reports its
> execution status, what it does is just update a portion of the tree in
> shared memory. Lock contention may be a problem though; perhaps things
> could be made atomic?
>

In order to report the progress of a SQL query, the progress request
handler walks the query's execution tree. No new node is ever created. The
execution tree has been slightly modified to collect the number of
rows/blocks returned during the executor run. The progress state is only
collected when a user requests it, and it is dumped into shared memory at
that moment.

As long as nobody asks for a progress report, there is no overhead apart
from a few counters updated during the executor run.
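As a purely hypothetical sketch of that cost model (the hook and field
names below are assumptions, not the patch's actual code), the per-tuple
work reduces to a counter bump, with the tree walk and shared-memory dump
deferred until a report is requested:

/* Sketch: each executor node only bumps a counter as tuples flow through. */
static TupleTableSlot *
ExecProcNodeWithProgress(PlanState *node)
{
	TupleTableSlot *slot = node->ExecProcNodeReal(node);	/* hypothetical hook */

	if (!TupIsNull(slot))
		node->progress_rows++;	/* hypothetical counter field */
	return slot;
}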

Regards


>
> > From a design point of view, I think a patch like this has to clear a
> > number of hurdles.  It needs to:
> >
> > 1. Decide what data to expose.  The sample output looks reasonable in
> > this case, although the amount of code churn looks really high.
> >
> > 2. Find a way to advertise the data that it exposes.  This patch looks
> > like it allocates a half-MB of shared memory per backend and gives up
> > if that's not enough.
>
> Perhaps DSM? It is not user-friendly to fail sporadically...
>
> > 3. Find a way to synchronize the access to the data that it exposes.
> > This patch ignores that problem completely, so multiple progress
> > report commands running at the same time will behave unpredictably.
>
> Yeah..
> --
> Michael
>


Re: [HACKERS] COPY FROM STDIN behaviour on end-of-file

2017-05-17 Thread Tom Lane
Thomas Munro  writes:
> On Wed, May 17, 2017 at 2:39 PM, Tom Lane  wrote:
>> Thanks for checking.  So that's two major platforms where it works "as
>> expected" already.

> Ah... the reason this is happening is that BSD-derived fread()
> implementations return immediately if the EOF flag is set[1], but
> others do not.

Hah, good detective work.  I tried this on the lone SysV-derived box
I have (ancient HPUX), and did not see the problem, so that seems to
confirm the comment you found that says this is a SysV-tradition vs
BSD-tradition thing.
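For the archives, the difference is easy to demonstrate outside psql with
a few lines of standard C (this is only an illustration of the behavior
Thomas describes, not the committed patch):

#include <stdio.h>

/*
 * On BSD-derived libcs, fread() returns 0 immediately once the stream's
 * EOF flag is set; SysV-derived ones attempt another read.  Clearing the
 * flag first makes a retry after Ctrl-D behave the same everywhere.
 */
static size_t
read_after_eof(FILE *stream, char *buf, size_t len)
{
	if (feof(stream))
		clearerr(stream);		/* forget the earlier EOF on a terminal */
	return fread(buf, 1, len, stream);
}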

>> If we're going
>> to go out of our way to make it work, should we mention it in psql-ref?

> I went looking for the place to put that and found that it already says:

> For \copy ... from stdin, data rows are read from the same
> source that issued the command, continuing until \.
> is read or the stream reaches EOF.

Yeah, it seems like that's clear enough already; I don't feel a need to
complicate it.

Another thing that would be nice is a regression test, but I don't see
any way to do that in our standard test environments.  I could imagine
building a test rig that fires up psql through a PTY, but making it
portable is a daunting prospect, and probably not worth the trouble.

> Here's a version incorporating your other suggestions and a comment to 
> explain.

I editorialized on the comment a bit and pushed it.  Thanks for the
report and patch!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pulling up more complicated subqueries

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 11:08 AM, Heikki Linnakangas  wrote:
> That's not a straight semi-join, but we could still turn it into a new kind
> of LEFT-SEMI join. A left-semi join is like a left join, in that it returns
> all rows from the left side, and NULLs for any non-matches. And like a
> semi-join, it returns only one matching row from the right-side, if there
> are duplicates. In the qual, replace the SubLink with an IS NOT NULL test.
...
> This can be implemented using yet another new join type, a LEFT-UNIQUE join.
> It's like a LEFT JOIN, but it must check that there are no duplicates in the
> right-hand-side, and throw an error if there are (ERROR:  more than one row
> returned by a subquery used as an expression).

It seems like we might want to split what is currently called JoinType
into two separate things -- one that is INNER/LEFT/RIGHT/FULL and the
other that says what to do about multiple matches, which could be that
they are expected, they are to be ignored (as in your LEFT-SEMI case),
or they should error out (as in your LEFT-UNIQUE case).
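A purely illustrative sketch of that split (these enum names are made up;
the existing JoinType conflates both axes):

/* Axis 1: which side's unmatched rows are preserved. */
typedef enum JoinShape
{
	JOIN_SHAPE_INNER,
	JOIN_SHAPE_LEFT,
	JOIN_SHAPE_RIGHT,
	JOIN_SHAPE_FULL
} JoinShape;

/* Axis 2: what to do about multiple matches on the inner side. */
typedef enum JoinMatchPolicy
{
	JMP_ALL,					/* return every match (current behavior) */
	JMP_FIRST,					/* ignore duplicates, as in SEMI / LEFT-SEMI */
	JMP_ERROR					/* raise an error, as in LEFT-UNIQUE */
} JoinMatchPolicy;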

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 3:06 AM, Tsunakawa, Takayuki
 wrote:
> What do you think of the following cases?  Don't you want to connect to other 
> servers?
>
> * The DBA shuts down the database.  The server takes a long time to do 
> checkpointing.  During the shutdown checkpoint, libpq tries to connect to the 
> server and receive an error "the database system is shutting down."
>
> * The former primary failed and now is trying to start as a standby, catching 
> up by applying WAL.  During the recovery, libpq tries to connect to the 
> server and receive an error "the database system is performing recovery."
>
> * The database server crashed due to a bug.  Unfortunately, the server takes 
> unexpectedly long time to shut down because it takes many seconds to write 
> the stats file (as you remember, Tom-san experienced 57 seconds to write the 
> stats file during regression tests.)  During the stats file write, libpq 
> tries to connect to the server and receive an error "the database system is 
> shutting down."
>
> These are equivalent to server failure.  I believe we should prioritize 
> rescuing errors during operation over detecting configuration errors.

Yeah, you have a point.  I'm willing to admit that we may have defined
the behavior of the feature incorrectly, provided that you're willing
to admit that you're proposing a definition change, not just a bug
fix.

Anybody else want to weigh in with an opinion here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Peter Eisentraut  writes:
> On 5/17/17 11:37, Tom Lane wrote:
>> Concretely, how about the attached?  I don't think this looks any
>> worse than the current layout.

> The previous setup has been in place for years and has never been a
> problem.  The alternatives are all quite a bit worse.

No, the previous setup hasn't been "in place for years".  These programs
were only NLS-ified last fall.  Before that the code looked like, eg,

printf("  -z, --stats[=record]   show statistics instead of records\n");
printf(" (optionally, show per-record 
statistics)\n");

so that there weren't string continuations for pgindent to fool with.

I'm not really convinced that having usage() print two- or three-line
switch descriptions is "quite a bit better" than what I suggested.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Peter Eisentraut  writes:
> On 5/17/17 10:14, Tom Lane wrote:
>> What I was concerned about was that pgindent will reindent the second
>> line so that it's impossible to tell whether the spacing is correct.

> pgindent moving string continuations to the left is a completely
> terrible behavior anyway and we should look into changing that.  Just
> look at the mess it makes out of SELECT queries in pg_dump.c.

I'd be on board with that, perhaps, but does anyone here have enough
insight into bsd_indent to fix that without breaking other stuff?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Alvaro Herrera
Tom Lane wrote:
> Peter Eisentraut  writes:
> > On 5/17/17 11:37, Tom Lane wrote:
> >> Concretely, how about the attached?  I don't think this looks any
> >> worse than the current layout.
> 
> > The previous setup has been in place for years and has never been a
> > problem.  The alternatives are all quite a bit worse.
> 
> No, the previous setup hasn't been "in place for years".  These programs
> were only NLS-ified last fall.

We use the same technique in other places such as pg_dump's help() too.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Tom Lane
Robert Haas  writes:
> Yeah, you have a point.  I'm willing to admit that we may have defined
> the behavior of the feature incorrectly, provided that you're willing
> to admit that you're proposing a definition change, not just a bug
> fix.

> Anybody else want to weigh in with an opinion here?

I'm not really on board with "try each server until you find one where
this dbname+username+password combination works".  That's just a recipe
for trouble, especially the password angle.

I think it's a good point that there are certain server responses that
we should take as equivalent to "server down", but by the same token
there are responses that we should not take that way.

I suggest that we need to conditionalize the decision based on what
SQLSTATE is reported.  Not sure offhand if it's better to have a whitelist
of SQLSTATEs that allow failing over to the next server, or a blacklist of
SQLSTATEs that don't.
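To make that concrete, a whitelist check might look like the sketch below.
The two SQLSTATEs shown (57P03 "cannot_connect_now" and 08006
"connection_failure") are illustrative picks, not a settled list:

#include <stdbool.h>
#include <string.h>

static bool
sqlstate_allows_failover(const char *sqlstate)
{
	/* Assumed-retryable states; a real list would need careful vetting. */
	static const char *const retryable[] = {"57P03", "08006", NULL};
	int			i;

	if (sqlstate == NULL)
		return true;			/* no server response at all: try next host */
	for (i = 0; retryable[i] != NULL; i++)
		if (strcmp(sqlstate, retryable[i]) == 0)
			return true;
	return false;
}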

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 10beta1 sequence regression failure on sparc64

2017-05-17 Thread Christoph Berg
The sequence regression tests are failing on Debian/sparc64:

 sequence ... FAILED
 polymorphism ... ok
 rowtypes ... ok
 returning... ok
 largeobject  ... ok
 with ... ok
 xml  ... ok
test identity ... ok
test event_trigger... ok
test stats... ok
== shutting down postmaster   ==


 1 of 178 tests failed. 


The differences that caused some tests to fail can be viewed in the
file "/<>/build/src/test/regress/regression.diffs".  A copy of the 
test summary that you see
above is saved in the file 
"/<>/build/src/test/regress/regression.out".

GNUmakefile:130: recipe for target 'check' failed
make[2]: *** [check] Error 1
make[2]: Leaving directory '/<>/build/src/test/regress'
Makefile:28: recipe for target 'check-regress-recurse' failed
make[1]: *** [check-regress-recurse] Error 2
make[1]: Leaving directory '/<>/build/src/test'
GNUmakefile:69: recipe for target 'check-world-src/test-recurse' failed
make: *** [check-world-src/test-recurse] Error 2
make: Leaving directory '/<>/build'
 build/src/test/regress/regression.diffs 
*** 477,485 ****
  ------------+---------+-----------------+----------+----+---+---+--------+----------------------+---------------------+----+-----
   regression | public  | sequence_test10 | smallint | 16 | 2 | 0 | 1      | -2                   | 32767               | 1  | NO
   regression | public  | sequence_test11 | integer  | 32 | 2 | 0 | 1      | 1                    | 2147483647          | 1  | NO
!  regression | public  | sequence_test12 | integer  | 32 | 2 | 0 | -1     | -2147483648          | -1                  | -1 | NO
!  regression | public  | sequence_test13 | integer  | 32 | 2 | 0 | -32768 | -2147483648          | 2147483647          | 1  | NO
!  regression | public  | sequence_test14 | integer  | 32 | 2 | 0 | 32767  | -2147483648          | 2147483647          | -1 | NO
   regression | public  | sequence_test2  | bigint   | 64 | 2 | 0 | 32     | 5                    | 36                  | 4  | YES
   regression | public  | sequence_test3  | bigint   | 64 | 2 | 0 | 1      | 1                    | 9223372036854775807 | 1  | NO
   regression | public  | sequence_test4  | bigint   | 64 | 2 | 0 | -1     | -9223372036854775808 | -1                  | -1 | NO
--- 484,490 ----
  ------------+---------+-----------------+----------+----+---+---+--------+----------------------+---------------------+----+-----
   regression | public  | sequence_test10 | smallint | 16 | 2 | 0 | 1      | -2                   | 32767               | 1  | NO
   regression | public  | sequence_test11 | integer  | 32 | 2 | 0 | 1      | 1                    | 2147483647          | 1  | NO
!  regression | public  | sequence_test13 | smallint | 16 | 2 | 0 | -32768 | -32768               | 32767               | 1  | NO
   regression | public  | sequence_test2  | bigint   | 64 | 2 | 0 | 32     | 5                    | 36                  | 4  | YES
   regression | public  | sequence_test3  | bigint   | 64 | 2 | 0 | 1      | 1                    | 9223372036854775807 | 1  | NO
   regression | public  | sequence_test4  | bigint   | 64 | 2 | 0 | -1     | -9223372036854775808 | -1                  | -1 | NO
***************
*** 487,500 ****
   regression | public  | sequence_test6  | smallint | 16 | 2 | 0 | 1      | 1                    | 32767               | 1  | NO

Re: [HACKERS] remove unnecessary flag has_null from PartitionBoundInfoData

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 6:53 AM, Jeevan Ladhe
 wrote:
> As discussed in default partition thread[1], here is the patch to remove
> has_null from PartitionBoundInfoData structure.
> Basically flag has_null is not needed and null_index can be checked if the
> current bound is having a null value or not.
>
> For simplicity of future use, in attached patch I have introduced a macro
> that
> would return TRUE if the given bound has null.

This seems like a good cleanup, but:

- It makes no sense to put a macro definition in a header file when
the corresponding structure definition is private to a specific .c
file.

- You failed to update the header comment for PartitionBoundInfoData
which mentions the has_null flag.

- The comment for null_index also seems like it should be updated to
mention that -1 is the sentinel value in general, not just for
range-partitioned tables.

I committed this with fixes for those issues, plus I renamed the macro
to partition_bound_accepts_nulls, which I think is more clear.
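Given that -1 is the sentinel for null_index, the macro presumably boils
down to a one-liner along these lines (the exact committed body is an
assumption here):

#define partition_bound_accepts_nulls(bi)	((bi)->null_index != -1)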

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Stephen Frost
Tom, Robert,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Robert Haas  writes:
> > Yeah, you have a point.  I'm willing to admit that we may have defined
> > the behavior of the feature incorrectly, provided that you're willing
> > to admit that you're proposing a definition change, not just a bug
> > fix.
> 
> > Anybody else want to weigh in with an opinion here?
> 
> I'm not really on board with "try each server until you find one where
> this dbname+username+password combination works".  That's just a recipe
> for trouble, especially the password angle.

Agreed.

> I think it's a good point that there are certain server responses that
> we should take as equivalent to "server down", but by the same token
> there are responses that we should not take that way.

Right.

> I suggest that we need to conditionalize the decision based on what
> SQLSTATE is reported.  Not sure offhand if it's better to have a whitelist
> of SQLSTATEs that allow failing over to the next server, or a blacklist of
> SQLSTATEs that don't.

No particular comment on this.  I do wonder about forward/backwards
compatibility in such lists and if SQLSTATE really covers all
cases/distinctions which are interesting when it comes to making this
decision.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-05-17 Thread Surafel Temesgen
Sorry for being very late. I also think the GUC version of the patch would
be acceptable and useful.

I modified the patch accordingly and added it to commitfest 2017-07.


Regards



Surafel

On Sat, Mar 4, 2017 at 10:24 AM, Robert Haas  wrote:

> On Tue, Feb 28, 2017 at 7:34 PM, Tom Lane  wrote:
> > Surafel Temesgen  writes:
> >> This assignment is on todo list and has a benefit of providing an
> >> additional defense against SQL-injection attacks.
> >
> > This is on the todo list?  Really?  It seems unlikely to be worth the
> > backwards-compatibility breakage.  I certainly doubt that we could
> > get away with unconditionally rejecting such cases with no "off" switch,
> > as you have here.
> >
> >> Previous mailing list discussion is here
> >> 
> >
> > That message points out specifically that we *didn't* plan to do this.
> > Perhaps back then (ten years ago) we could have gotten away with the
> > compatibility breakage, but now I doubt it.
>
> Probably true, but I bet it would be OK to add this as an optional
> behavior, controlled by a GUC.  I know behavior-changing GUCs aren't
> good, but this seems like a sufficiently-peripheral behavior that it
> would be OK.   Extensions, for example, wouldn't break, because
> they're executing inside the database, not through libpq.  Stored
> procedures wouldn't break either.  The only real risk is that the
> user's application itself might break, but there's an easy solution to
> that problem.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


disallow-multiple-queries-2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> I suggest that we need to conditionalize the decision based on what
>> SQLSTATE is reported.  Not sure offhand if it's better to have a whitelist
>> of SQLSTATEs that allow failing over to the next server, or a blacklist of
>> SQLSTATEs that don't.

> No particular comment on this.  I do wonder about forward/backwards
> compatibility in such lists and if SQLSTATE really covers all
> cases/distinctions which are interesting when it comes to making this
> decision.

If the server is reporting the same SQLSTATE for server-down types
of conditions as for server-up, then that's a bug and we need to change
the SQLSTATE assigned to one case or the other.  The entire point of
SQLSTATE is that it should generally capture distinctions as finely
as client software is likely to be interested in.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 12:48 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Yeah, you have a point.  I'm willing to admit that we may have defined
>> the behavior of the feature incorrectly, provided that you're willing
>> to admit that you're proposing a definition change, not just a bug
>> fix.
>
>> Anybody else want to weigh in with an opinion here?
>
> I'm not really on board with "try each server until you find one where
> this dbname+username+password combination works".  That's just a recipe
> for trouble, especially the password angle.

Sure, I know what *your* opinion is.  And I'm somewhat inclined to
agree, but not to the degree that I don't think we should hear what
other people have to say.

> I suggest that we need to conditionalize the decision based on what
> SQLSTATE is reported.  Not sure offhand if it's better to have a whitelist
> of SQLSTATEs that allow failing over to the next server, or a blacklist of
> SQLSTATEs that don't.

Urgh.  There are two things I don't like about that.  First, it's a
major redesign of this feature at the 11th hour.  Second, if we can't
even agree on the general question of whether all, some, or no server
errors should cause a retry, the chances of agreeing on which SQL
states to include in the retry loop are probably pretty low.  Indeed,
there might not be one answer that will be right for everyone.

One good argument for leaving this alone entirely is that this feature
was committed on November 3rd and this thread began on May 12th.  If
there was ample time before feature freeze to question the design and
nobody did, then I'm not sure why we should disregard the freeze to
start whacking it around now, especially on the strength of one
complaint.  It may be that after we get some field experience with
this the right thing to do will become clearer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> No, the previous setup hasn't been "in place for years".  These programs
>> were only NLS-ified last fall.

> We use the same technique in other places such as pg_dump's help() too.

Meh.  Well, I reverted the changes in question while we discuss it.

Changing the pgindent rule for such cases sounds kind of promising,
but will anyone pursue it?

(In any case, we should probably go ahead with the scheduled pgindent
run, and then we can consider a new run with new rules later; that
will ease seeing exactly what changes a new rule would make.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> One good argument for leaving this alone entirely is that this feature
> was committed on November 3rd and this thread began on May 12th.  If
> there was ample time before feature freeze to question the design and
> nobody did, then I'm not sure why we should disregard the freeze to
> start whacking it around now, especially on the strength of one
> complaint.  It may be that after we get some field experience with
> this the right thing to do will become clearer.

I am not particularly convinced by this argument.  As much as we hope
that committers have worked with a variety of people with varying
interests, and that individuals who are concerned about such things start
testing as soon as something is committed, that, frankly, isn't how the
world really works, based on my observations, at least.

The point of this period of time between feature freeze and actual
release is, more-or-less, to figure out if the solution we've reached
actually is a good one, and if not, to do something about it.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] 10beta1 sequence regression failure on sparc64

2017-05-17 Thread Tom Lane
Christoph Berg  writes:
> The sequence regression tests are failing on Debian/sparc64:
>  ...
> (This is only the last 100 lines of regression.diffs, if it helps I
> can try rebuilding and grabbing the full file.)

Yes please.  What we can see here looks to be just fallout from
a failure earlier in the run.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Alvaro Herrera
Tom Lane wrote:

> Changing the pgindent rule for such cases sounds kind of promising,
> but will anyone pursue it?

We have someone who has studied the BSD indent code and even sent us
patches to fix quite a few bugs, but we've largely ignored his efforts
so far.  I propose we take that indent as part of our repo, and patch it
to our preferences.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Tom Lane
Stephen Frost  writes:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> One good argument for leaving this alone entirely is that this feature
>> was committed on November 3rd and this thread began on May 12th.  If
>> there was ample time before feature freeze to question the design and
>> nobody did, then I'm not sure why we should disregard the freeze to
>> start whacking it around now, especially on the strength of one
>> complaint.  It may be that after we get some field experience with
>> this the right thing to do will become clearer.

> I am not particularly convinced by this argument.  As much as we hope
> that committers have worked with a variety of people with varying
> interests and that individuals who are concerned about such start
> testing just as soon as something is committed, that, frankly, isn't how
> the world really works, based on my observations, at least.

> The point of this period of time between feature freeze and actual
> release is, more-or-less, to figure out if the solution we've reached
> actually is a good one, and if not, to do something about it.

Sure, but part of the point of beta testing is to get user feedback.

I agree with Robert's point that major redesign of the feature on the
basis of one complaint isn't necessarily the way to go.  Since the
existing behavior is already out in beta1, let's wait and see if anyone
else complains.  We don't need to fix it Right This Instant.

Maybe add this to the list of open issues to reconsider mid-beta?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread David G. Johnston
On Wed, May 17, 2017 at 12:06 AM, Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: pgsql-hackers-ow...@postgresql.org
> > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> > Who is right is a judgement call, but I don't think it's self-evident
> that
> > users want to ignore anything and everything that might have gone wrong
> > with the connection to the first server, rather than only those things
> which
> > resemble a down server.  It seems quite possible to me that if we had
> defined
> > it as you are proposing, somebody would now be arguing for a behavior
> change
> > in the other direction.
>
> Judgment call... so, I understood that it's a matter of choosing between
> helping to detect configuration errors early or service continuity.


​This is how I've been reading this thread and I'm tending to agree with
prioritizing service continuity ​over configuration error detection.  As a
client if I have an alternative that ends up working I don't really care
whose fault it is that the earlier options weren't.  I don't have enough
experience to think up plausible scenarios here but I'm sold on the theory.

David J.


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> I agree with Robert's point that major redesign of the feature on the
> basis of one complaint isn't necessarily the way to go.  Since the
> existing behavior is already out in beta1, let's wait and see if anyone
> else complains.  We don't need to fix it Right This Instant.

Fair enough.

> Maybe add this to the list of open issues to reconsider mid-beta?

Works for me.

Thanks!

Stephen


signature.asc
Description: Digital signature


pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Tom Lane
Alvaro Herrera  writes:
> We have someone who has studied the BSD indent code and even sent us
> patches to fix quite a few bugs, but we've largely ignored his efforts
> so far.  I propose we take that indent as part of our repo, and patch it
> to our preferences.

Messing with pgindent didn't seem all that high-priority while we were
in development mode, and it would also have been pretty painful to have
a pgindent that wanted to revisit settled decisions when you just wanted
it to fix new code.  Maybe now (ie, over the next few weeks) is a good
time to pursue it, before we start a new round of code development.

Not sure about actually incorporating it into our repo.  Doing so would
make it easier for people to use, for sure, and the license seems to be
regular 3-clause BSD, so that angle is OK.  But do we want to be carrying
around another 150K of source code?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

2017-05-17 Thread Tels
Moin,

On Wed, May 17, 2017 12:34 pm, Robert Haas wrote:
> On Wed, May 17, 2017 at 3:06 AM, Tsunakawa, Takayuki
>  wrote:
>> What do you think of the following cases?  Don't you want to connect to
>> other servers?
>>
>> * The DBA shuts down the database.  The server takes a long time to do
>> checkpointing.  During the shutdown checkpoint, libpq tries to connect
>> to the server and receive an error "the database system is shutting
>> down."
>>
>> * The former primary failed and now is trying to start as a standby,
>> catching up by applying WAL.  During the recovery, libpq tries to
>> connect to the server and receive an error "the database system is
>> performing recovery."
>>
>> * The database server crashed due to a bug.  Unfortunately, the server
>> takes unexpectedly long time to shut down because it takes many seconds
>> to write the stats file (as you remember, Tom-san experienced 57 seconds
>> to write the stats file during regression tests.)  During the stats file
>> write, libpq tries to connect to the server and receive an error "the
>> database system is shutting down."
>>
>> These are equivalent to server failure.  I believe we should prioritize
>> rescuing errors during operation over detecting configuration errors.
>
> Yeah, you have a point.  I'm willing to admit that we may have defined
> the behavior of the feature incorrectly, provided that you're willing
> to admit that you're proposing a definition change, not just a bug
> fix.
>
> Anybody else want to weigh in with an opinion here?

Hm, to me the feature needs to be reliable (for certain values of
reliable) to be useful.

Consider that you have X hosts (redundancy), and a lot of applications
that want a stable connection to the one that (still) works, whichever
this is.

You can then either:

1. make one host the primary and the others standbys, and play DNS tricks
or similar to make it appear that there is only one working host, and
have all apps connect to the "one host" (and reconnect to it upon failure)

2. let each app try each host until it finds a working one; if the
connection breaks, retry with the next host

3. or use libpq and let it try the hosts for you.

However, if I understand it correctly, #3 only works reliably in certain
cases (e.g. host down), but not if a host is "sort of down". In that case
each app would again need code to retry different hosts until it finds a
working one, instead of letting libpq do the work.

That makes #3 sound hard to deploy in practice, as you might as well just
code up #1 or #2 and call it a day.

All the best,

Tels


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] remove unnecessary flag has_null from PartitionBoundInfoData

2017-05-17 Thread Jeevan Ladhe
> I committed this with fixes for those issues, plus I renamed the macro
> to partition_bound_accepts_nulls, which I think is more clear.


Thanks Robert.


Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Bruce Momjian
On Wed, May 17, 2017 at 01:06:26PM -0400, Tom Lane wrote:
> Alvaro Herrera  writes:
> > Tom Lane wrote:
> >> No, the previous setup hasn't been "in place for years".  These programs
> >> were only NLS-ified last fall.
> 
> > We use the same technique in other places such as pg_dump's help() too.
> 
> Meh.  Well, I reverted the changes in question while we discuss it.

Are we ready for a pgindent run now?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Piotr Stefaniak
On 2017-05-17 19:16, Alvaro Herrera wrote:
> Tom Lane wrote:
> 
>> Changing the pgindent rule for such cases sounds kind of promising,
>> but will anyone pursue it?
> 
> We have someone who has studied the BSD indent code and even sent us
> patches to fix quite a few bugs, but we've largely ignored his efforts
> so far.  I propose we take that indent as part of our repo, and patch it
> to our preferences.

That, I assume, would be me. Coincidentally, I'm about to push my fixes
upstream (FreeBSD). Before that happens, my changes can be obtained from
https://github.com/pstef/freebsd_indent and tested, if anyone wishes.





Re: [HACKERS] Hash Functions

2017-05-17 Thread Robert Haas
On Tue, May 16, 2017 at 4:25 PM, Jeff Davis  wrote:
> Why can't hash partitions be stored in tables the same way as we do TOAST?
> That should take care of the naming problem.

Hmm, yeah, something like that could be done, but every place where
you are currently allowed to refer to a partition by name would have
to be changed to accept some other syntax for specifying the
partition.  Even with all the things you propose to disallow, things
like CLUSTER, VACUUM, ANALYZE, etc. would still have to be accepted on
individual partitions.  I suspect even CREATE INDEX and DROP INDEX
would need to be accepted on individual partitions, because if an
index on one partition somehow becomes bloated while the corresponding
indexes on other partitions are OK, you'll want to create a
replacement index concurrently and drop the old one.  Of course, for
similar reasons, you'd need some way for \d on the parent to display
information on indexes on all the children, and all of that output
would have to be frobbed to use whatever syntax is now required, in lieu
of a name, to use an individual partition.  Error messages would have
to be adjusted in probably quite a few places to use the new notation,
too.  And on and on.  It's not impossible to do, but we could end up
chasing down loose ends for a very long time.

Beyond that, I think it's a bad idea to make hash partitions behave
completely differently from list and range partitions.  That's a lot
of extra code to maintain, and a lot of extra notional complexity
for users, for really not a lot of benefit.  I think we're taking a
significant but not overwhelming problem -- our current hash functions
aren't portable -- and through a chain of logical links eventually
ending up with the conclusion that the design of partitioning needs to
be totally overhauled.  I want to resist that conclusion.  I'm not
saying that the problem isn't a problem, or that there's not some
logic to each step in the chain, but it's not that hard to blow a
small problem up into a huge one by assuming the worst possible
consequences or the tightest possible requirements at each step.
http://tvtropes.org/pmwiki/pmwiki.php/Main/ForWantOfANail is not an
argument for stricter regulation of the blacksmith industry.

>> If Java has portable hash functions, why can't we?
>
> Java standardizes on a particular unicode encoding (utf-16). Are you
> suggesting that we do the same? Or is there another solution that I am
> missing?

Well, I've already said several times (and Peter Eisentraut has
agreed) that we don't really need the hash functions to be portable
across encodings.  I think there are at least three good arguments for
that position.

First, as Peter Geoghegan points out, the world is increasingly
standardizing on Unicode, and that trend seems likely to continue.  I
strongly suspect that UTF-8 is the most common database encoding by a
wide margin.  There may occasionally be reasons to avoid it if, for
example, you're using one of the Eastern languages that doesn't play
entirely nicely with UTF-8, or if you happen to be storing a large
number of characters that can be represented in a single byte in some
other encoding but which require multiple bytes in UTF-8, but for an
awful lot of people UTF-8 just works and there's no need to think any
further.  So, a lot of people will never hit the problem of needing to
migrate a database between encodings because they'll just use UTF-8.

Second, if the previous argument turns out to be wrong and the world
abandons UTF-8 in favor of some new and better system (UTF-9?), or if
users frequently want to make some other encoding conversion like
Tom's original example of LATIN1 -> UTF-8, we've already got a
proposed workaround for that case which seems like it will work just
fine.  Just dump your data with pg_dump
--insert-hash-partitioned-data-into-parent and reload on the new
system.  This isn't absolutely guaranteed to work if you've done
something silly that will make the load of a particular row work on
one partition and fail on some other one, but you probably won't do
that because it would be dumb.  Also, it will be a bit slower than a
regular dump-and-reload cycle because tuple routing isn't free.
Neither of these problems really sound very bad.  If we're going to
start fixing things that could cause database migrations/upgrades to
occasionally fail in corner cases or run more slowly than expected,
there's a long list of things that you can do in your DDL that will
make pg_upgrade bomb out, and many of them are things that bite users
with some regularity (e.g. tablespaces inside the data directory or
other tablespaces, dependencies on system objects that are changed in
the new version, ALTER USER .. SET ROLE).  For whatever reason, we
haven't viewed those warts as really high-priority items in need of
fixing; in some cases, instead of actually trying to improve
usability, we've all but mocked the people reporting those issues for
having the temerity to do configur

Re: [HACKERS] [POC] hash partitioning

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 1:41 AM, Ashutosh Bapat
 wrote:
>> Fixed in the attached version;  used "hash partition remainder must be
>> greater than or equal to 0" instead.
>
> I would suggest "non-zero positive", since that's what we are using in
> the documentation.

Well, that's not very good terminology, because zero is not a positive
number.  Existing error messages seem to use phrasing such as "THING
must be a positive integer" when zero is not allowed or "THING must be
a non-negative integer" when zero is allowed.  For examples, do git
grep errmsg.*positive or git grep errmsg.*negative.
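
As a stand-alone illustration of that convention (a sketch only -- in
the backend these would be ereport(ERROR, ...) calls, and the exact
message strings and function names here are hypothetical):

#include <stdio.h>
#include <stdlib.h>

/* zero is allowed for the remainder, so say "non-negative";
 * zero is not allowed for the modulus, so say "positive" */
static void
check_hash_bound(int modulus, int remainder)
{
    if (modulus <= 0)
    {
        fprintf(stderr, "modulus for hash partition must be a positive integer\n");
        exit(EXIT_FAILURE);
    }
    if (remainder < 0)
    {
        fprintf(stderr, "remainder for hash partition must be a non-negative integer\n");
        exit(EXIT_FAILURE);
    }
}

int
main(void)
{
    check_hash_bound(8, 0);     /* OK: a remainder of zero is allowed */
    check_hash_bound(0, 3);     /* fails: the modulus must be positive */
    return 0;
}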

> In partition_bounds_equal(), please add comments explaining why is it safe to
> check just the indexes? May be we should add code under assertion to make sure
> that the datums are equal as well. The comment could be something
> like, "If two partitioned tables have different greatest moduli, their
> partition schemes don't match. If they have same greatest moduli, and
> all remainders have different indexes, they all have same modulus
> specified and the partitions are ordered by remainders, thus indexes
> array will be an identity i.e. index[i] = i. If the partition
> corresponding to a given remainder exists, it will have same index
> entry for both partitioned tables or if it's missing it will be -1.
> Thus if indexes array matches, corresponding datums array matches. If
> there are multiple remainders corresponding to a given partition,
> their partitions are ordered by the lowest of the remainders, thus if
> indexes array matches, both of the tables have same indexes arrays, in
> both the tables remainders corresponding to multiple partitions all
> have same indexes and thus same modulus. Thus again if the indexes are
> same, datums are same.".

That seems quite long.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Bruce Momjian  writes:
> Are we ready for a pgindent run now?

Yes, might as well do it.  Some of these discussions might lead to
a re-run with a newer version of pgindent, but it would be good to
have a clean tree to start from.

regards, tom lane




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > We have someone who has studied the BSD indent code and even sent us
> > patches to fix quite a few bugs, but we've largely ignored his efforts
> > so far.  I propose we take that indent as part of our repo, and patch it
> > to our preferences.
> 
> Messing with pgindent didn't seem all that high-priority while we were
> in development mode, and it would also have been pretty painful to have
> a pgindent that wanted to revisit settled decisions when you just wanted
> it to fix new code.  Maybe now (ie, over the next few weeks) is a good
> time to pursue it, before we start a new round of code development.

Sounds reasonable.

> Not sure about actually incorporating it into our repo.  Doing so would
> make it easier for people to use, for sure, and the license seems to be
> regular 3-clause BSD, so that angle is OK.  But do we want to be carrying
> around another 150K of source code?

The alternatives are

1. rely on the dead code we've been using so far (the old BSD indent
patched with our Pg-specific tweaks), or

2. rely on someone else's upstream code -- in this case, FreeBSD's as
patched by Piotr.

Now that Piotr's is about to find a home, perhaps it's okay for us to
rely on that one.  I just didn't like the idea of running something from
Piotr's personal repo.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Stephen Frost
Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Bruce Momjian  writes:
> > Are we ready for a pgindent run now?
> 
> Yes, might as well do it.  Some of these discussions might lead to
> a re-run with a newer version of pgindent, but it would be good to
> have a clean tree to start from.

+1.

Thanks!

Stephen




Re: [HACKERS] PROVE_FLAGS

2017-05-17 Thread Robert Haas
On Tue, May 16, 2017 at 9:20 PM, Andrew Dunstan
 wrote:
> Inheriting variables from the environment is a part of make by design.
> We have PG_PROVE_FLAGS for our own forced settings.

I don't buy this argument.  We've had previous cases where we've gone
through and added -X to psql invocations in the regression tests
because, if you don't, some people's .psqlrc files may cause tests to
fail, which nobody wants.  The more general principle is that the
regression tests should ideally pass regardless of the local machine
configuration.  It's proven impractical to make that universally true,
but that doesn't make it a bad goal.  Now, for all I know it may be
that setting PROVE_FLAGS can never cause a regression failure, but I
still tend to agree with Peter that the regression tests framework
should insulate us from the surrounding environment rather than
absorbing settings from it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Tom Lane
Piotr Stefaniak  writes:
> On 2017-05-17 19:16, Alvaro Herrera wrote:
>> We have someone who has studied the BSD indent code and even sent us
>> patches to fix quite a few bugs, but we've largely ignored his efforts
>> so far.  I propose we take that indent as part of our repo, and patch it
>> to our preferences.

> That, I assume, would be me. Coincidentally, I'm about to push my fixes
> upstream (FreeBSD). Before that happens, my changes can be obtained from
> https://github.com/pstef/freebsd_indent and tested, if anyone wishes.

Yes, I was just reviewing those threads.  IIUC, the current proposal is
to adopt FreeBSD's version of indent as being less buggy and better
maintained than NetBSD's, and then patch it as necessary.

We'd put off the decision about what exactly to do until a more suitable
time, but I think that time is now.  What I think we ought to do is go ahead
and indent the tree with current pgindent, and then we have a basis for
experimenting with replacement versions and seeing what they'll do
differently.  If we can make a decision and adopt any changes within
the next few weeks, I think that that timing will be about the least pain
we can hope for.

If we do anything with as much impact as changing the indentation rule
for string continuations, I will certainly vote for running the new
pgindent over the back branches too.  I still bear the scars of dealing
with the comment-reflowing changes that Bruce put in circa 8.1 --- that
broke just about every back-patching effort for the next five years.
I don't want to go through that again.

regards, tom lane




Re: [HACKERS] Hash Functions

2017-05-17 Thread Tom Lane
Robert Haas  writes:
> On Tue, May 16, 2017 at 4:25 PM, Jeff Davis  wrote:
>> Why can't hash partitions be stored in tables the same way as we do TOAST?
>> That should take care of the naming problem.

> Hmm, yeah, something like that could be done, but every place where
> you are currently allowed to refer to a partition by name would have
> to be be changed to accept some other syntax for specifying the
> partition.

Uh ... toast tables have regular names, and can be specified in commands
just like any other table.  I don't see why these "auto" partition tables
couldn't be handled the same way.

> Beyond that, I think it's a bad idea to make hash partitions behave
> completely differently from list and range partitions.

I think the question is whether we are going to make a distinction between
logical partitions (where the data division rule makes some sense to the
user) and physical partitions (where it needn't).  I think it might be
perfectly reasonable for those to behave differently.

regards, tom lane




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Andres Freund
On 2017-05-17 13:35:22 -0400, Tom Lane wrote:
> Not sure about actually incorporating it into our repo.  Doing so would
> make it easier for people to use, for sure, and the license seems to be
> regular 3-clause BSD, so that angle is OK.  But do we want to be carrying
> around another 150K of source code?

150k? Isn't it like 3-4k?

- Andres




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Not sure about actually incorporating it into our repo.  Doing so would
>> make it easier for people to use, for sure, and the license seems to be
>> regular 3-clause BSD, so that angle is OK.  But do we want to be carrying
>> around another 150K of source code?

> The alternatives are

> 1. rely on the dead code we've been using so far (the old BSD indent
> patched with our Pg-specific tweaks), or

> 2. rely on someone else's upstream code -- in this case, FreeBSD's as
> patched by Piotr.

> Now that Piotr's is about to find a home, perhaps it's okay for us to
> rely on that one.  I just didn't like the idea of running something from
> Piotr's personal repo.

Well, "pg_bsd_indent is whatever you can find in the FreeBSD repo" is
not a rule that is going to work either.  We need to have a standardized
version that all developers can run and get the same results.  So I
think we'll either have a blessed tarball that we pass around (same
as we do now), or we'll put it into our own tree.  I don't really see
much downside to the latter except bloat.

regards, tom lane




Re: [HACKERS] fix hard-coded index in make_partition_op_expr

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 5:38 AM, Jeevan Ladhe
 wrote:
> While browsing through the partitioning code, I noticed that a recent commit
> f8bffe9e6d700fd34759a92e47930ce9ba7dcbd5, which fixes multi-column range
> partitioning constraints, introduced a function make_partition_op_expr that
> takes keynum as an input parameter to identify the index of the partition
> key.
> In the case of range partitioning we can have multiple partition keys, but
> for list partitioning we have only one.  Considering that, I think the
> following code does not cause any side effect logically (and may be an
> oversight while moving the code from get_qual_for_list to this function):
>
> saopexpr->inputcollid = key->partcollation[0];
> saopexpr->args = list_make2(arg1, arg2);
>
> But as we have keynum now, shouldn't we be using it to index
> key->partcollation, instead of a hard-coded '0'?

Agreed.  Committed your patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Tom Lane
Andres Freund  writes:
> On 2017-05-17 13:35:22 -0400, Tom Lane wrote:
>> Not sure about actually incorporating it into our repo.  Doing so would
>> make it easier for people to use, for sure, and the license seems to be
>> regular 3-clause BSD, so that angle is OK.  But do we want to be carrying
>> around another 150K of source code?

> 150k? Isn't it like 3-4k?

The version we're using now is

$ tar xfz ~/archive/pg_bsd_indent-1.3.tar.gz 
$ wc pg_bsd_indent/*
     38     122    928 pg_bsd_indent/Makefile
    107     831   4835 pg_bsd_indent/README
    508    1743  11988 pg_bsd_indent/args.c
    569    2727  14732 pg_bsd_indent/indent.1
   1288    5677  37815 pg_bsd_indent/indent.c
    101     671   4251 pg_bsd_indent/indent_codes.h
    367    2376  15206 pg_bsd_indent/indent_globs.h
    685    2781  18045 pg_bsd_indent/io.c
    634    2709  17017 pg_bsd_indent/lexi.c
    306    1133   8019 pg_bsd_indent/netbsd.patch
    366    1659  10953 pg_bsd_indent/parse.c
    478    2427  15199 pg_bsd_indent/pr_comment.c
   5447   24856 158988 total

(Note I was counting bytes not LOC.)

I've not looked at FreeBSD's version, but I'll bet a nickel it's bigger.

regards, tom lane




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Andres Freund
On 2017-05-17 14:44:31 -0400, Tom Lane wrote:
> $ tar xfz ~/archive/pg_bsd_indent-1.3.tar.gz 
> $ wc pg_bsd_indent/*
>      38     122    928 pg_bsd_indent/Makefile
>     107     831   4835 pg_bsd_indent/README
>     508    1743  11988 pg_bsd_indent/args.c
>     569    2727  14732 pg_bsd_indent/indent.1
>    1288    5677  37815 pg_bsd_indent/indent.c
>     101     671   4251 pg_bsd_indent/indent_codes.h
>     367    2376  15206 pg_bsd_indent/indent_globs.h
>     685    2781  18045 pg_bsd_indent/io.c
>     634    2709  17017 pg_bsd_indent/lexi.c
>     306    1133   8019 pg_bsd_indent/netbsd.patch
>     366    1659  10953 pg_bsd_indent/parse.c
>     478    2427  15199 pg_bsd_indent/pr_comment.c
>    5447   24856 158988 total


> (Note I was counting bytes not LOC.)

Ah, that explains that ;)


> I've not looked at FreeBSD's version, but I'll bet a nickel it's bigger.

Not really, even counting the added tests.

   1275    5805  38674 freebsd_indent/indent.c
    595    2867  15020 freebsd_indent/indent.1
      9      25    240 freebsd_indent/Makefile
    342    1451   9587 freebsd_indent/parse.c
    343    2234  14637 freebsd_indent/indent_globs.h
    356    1771  11311 freebsd_indent/pr_comment.c
    522    2190  14063 freebsd_indent/io.c
     18      48    361 freebsd_indent/Makefile.depend
      7      18     89 freebsd_indent/tests/cppelsecom.0.stdout
      0       1      5 freebsd_indent/tests/nsac.pro
     52     173   1093 freebsd_indent/tests/comments.0
      6      11     54 freebsd_indent/tests/nsac.0.stdout
      0       1      4 freebsd_indent/tests/label.pro
      8      20     96 freebsd_indent/tests/float.0.stdout
      0       1      4 freebsd_indent/tests/sac.pro
      9      20    127 freebsd_indent/tests/surplusbad.0.stdout
     60     177   1127 freebsd_indent/tests/comments.0.stdout
      0       1     22 freebsd_indent/tests/types_from_file.pro
      7      18     86 freebsd_indent/tests/cppelsecom.0
     30      46    284 freebsd_indent/tests/f_decls.0.stdout
     23      50    571 freebsd_indent/tests/parens.0
     14      35    246 freebsd_indent/tests/list_head.0.stdout
     73     147    900 freebsd_indent/tests/declarations.0.stdout
     16      35    242 freebsd_indent/tests/list_head.0
     21      51    214 freebsd_indent/tests/struct.0
      6      20     95 freebsd_indent/tests/float.0
     42     118    555 freebsd_indent/tests/elsecomment.0
     13      31    134 freebsd_indent/tests/label.0
     23      50    558 freebsd_indent/tests/parens.0.stdout
     27      51    286 freebsd_indent/tests/f_decls.0
      9      20    125 freebsd_indent/tests/surplusbad.0
      9      31    208 freebsd_indent/tests/binary.0
      4      12     54 freebsd_indent/tests/nsac.0
      0       1      4 freebsd_indent/tests/surplusbad.pro
      4      12     54 freebsd_indent/tests/sac.0
      1       3     15 freebsd_indent/tests/parens.pro
      5      19     95 freebsd_indent/tests/offsetof.0
      6      17    102 freebsd_indent/tests/wchar.0.stdout
     47     118    578 freebsd_indent/tests/elsecomment.0.stdout
     79     143    850 freebsd_indent/tests/declarations.0
      3      14     60 freebsd_indent/tests/types_from_file.0
      0       1      3 freebsd_indent/tests/elsecomment.pro
      6      12     55 freebsd_indent/tests/sac.0.stdout
      1       2      3 freebsd_indent/tests/types_from_file.list
      6      17     94 freebsd_indent/tests/wchar.0
      3      15     62 freebsd_indent/tests/types_from_file.0.stdout
     11      31    209 freebsd_indent/tests/binary.0.stdout
      7      19     96 freebsd_indent/tests/offsetof.0.stdout
     14      31    207 freebsd_indent/tests/label.0.stdout
      0       1      4 freebsd_indent/tests/comments.pro
     23      47    215 freebsd_indent/tests/struct.0.stdout
    100     818   4720 freebsd_indent/README
     51     300   2082 freebsd_indent/indent.h
    351    1415  10848 freebsd_indent/args.c
     75     425   2740 freebsd_indent/indent_codes.h
    654    2577  17238 freebsd_indent/lexi.c
   5366   23567 151406 total


- Andres




Re: [HACKERS] fix hard-coded index in make_partition_op_expr

2017-05-17 Thread Jeevan Ladhe
On Thu, May 18, 2017 at 12:13 AM, Robert Haas  wrote:
>
> Agreed.  Committed your patch.
>

 Thanks Robert!


[HACKERS] postgres_fdw aggregation pushdown has collation change in 10beta.

2017-05-17 Thread Jeff Janes
It is shipping collation-sensitive aggregates between servers which have
different collations.

commit 7012b132d07c2b4ea15b0b3cb1ea9f3278801d98
Author: Robert Haas 
Date:   Fri Oct 21 09:54:29 2016 -0400

postgres_fdw: Push down aggregates to remote servers.


I've attached a reproducing case.  Before this commit, the two final
queries give the same answer, and after they give different answers.  Maybe
this isn't considered a bug?  Is it just one of the "surprising semantic
anomalies may arise when types or collations do not match"?  It should be
able to know what collation the local definition of the foreign table has;
couldn't it pass that collation over to the foreign side?

I don't really care, I was just using min as a way to get an arbitrary
value in the cases where there are more than one, but I found the change
surprising.

Cheers,

Jeff
drop database foobar_C;
drop database foobar_US;
create database foobar_C encoding 'utf-8' lc_collate "C" template template0;
create database foobar_US encoding 'utf-8' lc_collate "en_US.utf8" template template0;

\c foobar_us

create table remote1 (x text);
\copy remote1 from stdin
a
P
\.

\c foobar_c

CREATE EXTENSION IF NOT EXISTS postgres_fdw WITH SCHEMA public;

CREATE SERVER remote_server FOREIGN DATA WRAPPER postgres_fdw options (dbname 'foobar_us');
CREATE USER MAPPING FOR postgres SERVER remote_server OPTIONS (
"user" 'postgres'
);

CREATE FOREIGN TABLE remote1 (
   x text
)
SERVER remote_server
OPTIONS (
schema_name 'public',
table_name 'remote1',
use_remote_estimate 'true'
);

create table local1 (x text);
\copy local1 from stdin
a
P
\.

select min(x) from local1;
select min(x) from remote1;



Re: [HACKERS] Hash Functions

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 2:35 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, May 16, 2017 at 4:25 PM, Jeff Davis  wrote:
>>> Why can't hash partitions be stored in tables the same way as we do TOAST?
>>> That should take care of the naming problem.
>
>> Hmm, yeah, something like that could be done, but every place where
>> you are currently allowed to refer to a partition by name would have
>> to be be changed to accept some other syntax for specifying the
>> partition.
>
> Uh ... toast tables have regular names, and can be specified in commands
> just like any other table.  I don't see why these "auto" partition tables
> couldn't be handled the same way.

Really?  That seems like a huge usability fail to me.  If somebody
wants to create an index on one partition of a hash-partitioned table,
or reindex an index, do you really want them to have to dig out an
internal name to do it?  And how exactly would you dump and restore
the partitions and their indexes?  It's true that there are some
operations that can be performed directly on a TOAST table, but the
saving grace is that you usually don't need to do any of them.  That
won't be true here.

>> Beyond that, I think it's a bad idea to make hash partitions behave
>> completely differently from list and range partitions.
>
> I think the question is whether we are going to make a distinction between
> logical partitions (where the data division rule makes some sense to the
> user) and physical partitions (where it needn't).  I think it might be
> perfectly reasonable for those to behave differently.

I don't think I'd like to go so far as to say that it's unreasonable,
but I certainly wouldn't say I'm optimistic about such a design.  I do
not think that it is going to work to conceal from the user that the
partitions are really separate tables with their own indexes.  I also
think that trying to make such a thing work is just going to lead to a
lot of time and energy spent trying to paper over problems that are
basically self-inflicted, and that papering over those problems won't
really end up having any value for users.

Remember, the chain of reasoning here is something like:

1. To handle dump-and-reload the way partitioning does today, hash
functions would need to be portable across encodings.
2. That's impractically difficult.
3. So let's always load data through the top-parent.
4. But that could fail due to e.g. a UNIQUE index on an individual
child, so let's try to prohibit all of the things that could be done
to an individual partition that could cause a reload failure.
5. And then for good measure let's hide the existence of the partitions, too.

Every step in that chain of logic has a certain sense to it, but none
of them are exactly water-tight.  #1 is basically a value judgement:
would people rather (a) have faster hash functions, or (b) would they
rather be able to port a database to a different encoding without
having rows move between hash functions?  The statement is only true
if you think it's the latter, but I tend to think it's the former.  #2
is a judgement that the performance characteristics of
as-yet-unwritten portable hashing will be so bad that nobody could
possibly be satisfied with it.  #3 is a great idea as an optional
behavior, but it's only a strict necessity if you're totally committed
to #1 and #2.  It also has some performance cost, which makes it
somewhat undesirable as a default behavior.  #4 is *probably* a
necessary consequence of #3.  I don't know what the argument for #5 is
unless it's that #4 isn't hard enough already.
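
To make #1 and #2 concrete: "portable" text hashing would mean
converting every string to one canonical encoding before hashing its
bytes, and that per-hash conversion is exactly the cost feared in #2.
A minimal sketch, with FNV-1a standing in for the real hash function
and the conversion step elided (the input is assumed to already be
UTF-8):

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Sketch only: hash bytes after (assumed) conversion to UTF-8, so the
 * result does not depend on the database encoding.  FNV-1a is a
 * stand-in, not PostgreSQL's hash function. */
static uint32_t
portable_text_hash(const unsigned char *utf8, size_t len)
{
    uint32_t    h = 2166136261u;    /* FNV offset basis */

    for (size_t i = 0; i < len; i++)
    {
        h ^= utf8[i];
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}

int
main(void)
{
    printf("%u\n", portable_text_hash((const unsigned char *) "abc", 3));
    return 0;
}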

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] 10beta1/m68k: static assertion failed: "MAXALIGN too small to fit int32"

2017-05-17 Thread Christoph Berg
Not sure if a lot of people still care about m68k, but it's still one
of the unofficial Debian ports (it used to be the first non-x86 port
done decades ago):

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Wendif-labels -Wmissing-format-attribute -Wformat-security 
-fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g -O2 
-fdebug-prefix-map=/<>=. -fstack-protector-strong -Wformat 
-Werror=format-security -I/usr/include/mit-krb5 -no-pie 
-I../../../../src/include -I/<>/build/../src/include -Wdate-time 
-D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include/tcl8.6 
  -c -o slab.o /<>/build/../src/backend/utils/mmgr/slab.c
In file included from /<>/build/../src/include/postgres.h:47:0,
 from 
/<>/build/../src/backend/utils/mmgr/slab.c:53:
/<>/build/../src/backend/utils/mmgr/slab.c: In function 
'SlabContextCreate':
/<>/build/../src/include/c.h:753:7: error: static assertion 
failed: "MAXALIGN too small to fit int32"
  do { _Static_assert(condition, errmessage); } while(0)
   ^
/<>/build/../src/backend/utils/mmgr/slab.c:198:2: note: in 
expansion of macro 'StaticAssertStmt'
  StaticAssertStmt(MAXIMUM_ALIGNOF >= sizeof(int),
  ^~~~
: recipe for target 'slab.o' failed
make[5]: *** [slab.o] Error 1


The code there is:

/*
 * SlabContextCreate
 *  Create a new Slab context.
 *
 * parent: parent context, or NULL if top-level context
 * name: name of context (for debugging --- string will be copied)
 * blockSize: allocation block size
 * chunkSize: allocation chunk size
 *
 * The chunkSize may not exceed:
 *  MAXALIGN_DOWN(SIZE_MAX) - MAXALIGN(sizeof(SlabBlock)) - SLAB_CHUNKHDRSZ
 *
 */
MemoryContext
SlabContextCreate(MemoryContext parent,
  const char *name,
  Size blockSize,
  Size chunkSize)
{
    int         chunksPerBlock;
    Size        fullChunkSize;
    Size        freelistSize;
    SlabContext *slab;

    StaticAssertStmt(offsetof(SlabChunk, slab) + sizeof(MemoryContext) ==
                     MAXALIGN(sizeof(SlabChunk)),
                     "padding calculation in SlabChunk is wrong");

/* otherwise the linked list inside freed chunk isn't guaranteed to fit */
StaticAssertStmt(MAXIMUM_ALIGNOF >= sizeof(int),
 "MAXALIGN too small to fit int32");

/* chunk, including SLAB header (both addresses nicely aligned) */
fullChunkSize = MAXALIGN(sizeof(SlabChunk) + MAXALIGN(chunkSize));


I don't have the pg_config.h file at hand, but the 9.6 version has
this:

/* The normal alignment of `double', in bytes. */
#define ALIGNOF_DOUBLE 2

/* The normal alignment of `int', in bytes. */
#define ALIGNOF_INT 2

/* The normal alignment of `long', in bytes. */
#define ALIGNOF_LONG 2

/* The normal alignment of `long long int', in bytes. */
#define ALIGNOF_LONG_LONG_INT 2

/* The normal alignment of `short', in bytes. */
#define ALIGNOF_SHORT 2

/* Define as the maximum alignment requirement of any C data type. */
#define MAXIMUM_ALIGNOF 2
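
For what it's worth, the assertion exists because the slab allocator
reuses the storage of freed chunks to hold a freelist index.  Here is a
toy model of that idea (an illustration only, not the actual slab.c
code):

#include <stdio.h>
#include <stdlib.h>

/* Toy model of why MAXALIGN must be able to hold an int32: a slab
 * allocator overlays a freelist index on the first bytes of each
 * *freed* chunk.  With MAXIMUM_ALIGNOF == 2, as in the m68k pg_config.h
 * above, chunks may be too small and too weakly aligned to hold a
 * 4-byte int, hence the StaticAssertStmt. */
typedef struct FreedChunk
{
    int     next_free;      /* overlaid on the chunk's own storage */
} FreedChunk;

int
main(void)
{
    char       *arena = malloc(64);     /* pretend: eight 8-byte chunks */
    FreedChunk *head = (FreedChunk *) arena;

    head->next_free = 3;    /* link chunk 0 -> chunk 3 in the freelist */
    printf("next free chunk: %d\n", head->next_free);
    free(arena);
    return 0;
}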


I don't think anyone is actually going to run a PG server on m68k, but
the same source package is building libpq5, which is not dispensable.

Christoph




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Piotr Stefaniak
On 2017-05-17 20:31, Tom Lane wrote:
> Piotr Stefaniak  writes:
>> On 2017-05-17 19:16, Alvaro Herrera wrote:
>>> We have someone who has studied the BSD indent code and even sent us
>>> patches to fix quite a few bugs, but we've largely ignored his efforts
>>> so far.  I propose we take that indent as part of our repo, and patch it
>>> to our preferences.
> 
>> That, I assume, would be me. Coincidentally, I'm about to push my fixes
>> upstream (FreeBSD). Before that happens, my changes can be obtained from
>> https://github.com/pstef/freebsd_indent and tested, if anyone wishes.
> 
> Yes, I was just reviewing those threads.  IIUC, the current proposal is
> to adopt FreeBSD's version of indent as being less buggy and better
> maintained than NetBSD's, and then patch it as necessary.

One of my goals here is to fix bugs in FreeBSD indent so it's easier to
develop and maintain.  I've also tried hard not to introduce serious
regressions for FreeBSD, which also uses indent (for some sub-projects
even automatically).  This is an ongoing effort.

I believe that what I describe below has largely been achieved.

Another goal is to incorporate into FreeBSD indent all the custom
changes that make the current pg_bsd_indent yet another unique indent
fork, so that the Postgres project wouldn't need to maintain patches
for some other indent.  I understand the need to have control over a
copy of it within the Postgres project, but it would be nice not to
effectively fork it by carrying custom patches around.

The third significant issue I kept in my mind was to shift some
work-arounds from pgindent to indent. When I use my indent as the base
for pgindent and set its options like this:
-bad -bap -bc -bl -d0 -cdb -nce -nfc1 -di12 -i4 -l79 -lp -nip -npro -bbb -cli1 -sac -ts4 -cp10
I can remove most of the work-arounds written in the perl script and
still get pretty decent results. But I expect some debate over a few things.





Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Piotr Stefaniak  writes:
> That, I assume, would be me. Coincidentally, I'm about to push my fixes
> upstream (FreeBSD). Before that happens, my changes can be obtained from
> https://github.com/pstef/freebsd_indent and tested, if anyone wishes.

Do you have recommendations for the switches to use with this to
match PG style?  It doesn't look like it takes quite the same
switches as the old netbsd code.

regards, tom lane




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Tom Lane
Piotr Stefaniak  writes:
> The third significant issue I kept in my mind was to shift some
> work-arounds from pgindent to indent.

Yeah, I was wondering about that too.

> When I use my indent as the base
> for pgindent and set its options like this:
> -bad -bap -bc -bl -d0 -cdb -nce -nfc1 -di12 -i4 -l79 -lp -nip -npro -bbb -cli1 -sac -ts4 -cp10

Ah, thanks, ignore the message I just sent asking for that ...

> I can remove most of the work-arounds written in the perl script and
> still get pretty decent results. But I expect some debate over a few things.

... but what parts of the perl script do you remove?  I'm trying
to duplicate whatever results you're currently getting.

regards, tom lane




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Piotr Stefaniak
On 2017-05-17 22:11, Tom Lane wrote:
> Piotr Stefaniak  writes:
>> The third significant issue I kept in my mind was to shift some
>> work-arounds from pgindent to indent.
> 
> Yeah, I was wondering about that too.
> 
>> When I use my indent as the base
>> for pgindent and set its options like this:
>> -bad -bap -bc -bl -d0 -cdb -nce -nfc1 -di12 -i4 -l79 -lp -nip -npro -bbb -cli1 -sac -ts4 -cp10
> 
> Ah, thanks, ignore the message I just sent asking for that ...
> 
>> I can remove most of the work-arounds written in the perl script and
>> still get pretty decent results. But I expect some debate over a few things.
> 
> ... but what parts of the perl script do you remove?  I'm trying
> to duplicate whatever results you're currently getting.

Full copy of my pgindent attached.  Changes commented below.


@@ -17,7 +17,7 @@ my $devnull = File::Spec->devnull;

 # Common indent settings
 my $indent_opts =
-  "-bad -bap -bc -bl -d0 -cdb -nce -nfc1 -di12 -i4 -l79 -lp -nip -npro -bbb";
+  "-bad -bap -bc -bl -d0 -cdb -nce -nfc1 -di12 -i4 -l79 -lp -nip -npro -bbb -cli1 -sac -ts4 -cp10";

 # indent-dependent settings
 my $extra_opts = "";

The additional options are necessary. Mostly they replace the work-arounds.


@@ -198,60 +198,16 @@ sub pre_indent
 {
my $source = shift;

-   # remove trailing whitespace
-   $source =~ s/[ \t]+$//gm;
-

I'm not sure there won't be any trailing white-space, but I've committed
changes that should limit it.


## Comments

# Convert // comments to /* */
$source =~ s!^([ \t]*)//(.*)$!$1/* $2 */!gm;

-   # 'else' followed by a single-line comment, followed by
-   # a brace on the next line confuses BSD indent, so we push
-   # the comment down to the next line, then later pull it
-   # back up again.  Add space before _PGMV or indent will add
-   # it for us.
-   # AMD: A symptom of not getting this right is that you see errors like:
-   # FILE: ../../../src/backend/rewrite/rewriteHandler.c
-   # Error@2259:
-   # Stuff missing from end of file
-   $source =~
- s!(\}|[ \t])else[ \t]*(/\*)(.*\*/)[ \t]*$!$1else\n$2 _PGMV$3!gm;
-
-   # Indent multi-line after-'else' comment so BSD indent will move it
-   # properly. We already moved down single-line comments above.
-   # Check for '*' to make sure we are not in a single-line comment
that
-   # has other text on the line.
$source =~ s!(\}|[ \t])else[ \t]*(/\*[^*]*)[ \t]*$!$1else\n $2!gm;

These are definitely fixed.


## Other

-   # Work around bug where function that defines no local variables
-   # misindents switch() case lines and line after #else.  Do not do
-   # for struct/enum.
-   my @srclines = split(/\n/, $source);
-   foreach my $lno (1 .. $#srclines)
-   {
-   my $l2 = $srclines[$lno];
-
-   # Line is only a single open brace in column 0
-   next unless $l2 =~ /^\{[ \t]*$/;
-
-   # previous line has a closing paren
-   next unless $srclines[ $lno - 1 ] =~ /\)/;
-
-   # previous line was struct, etc.
-   next
- if $srclines[ $lno - 1 ] =~
- m!=|^(struct|enum|[ \t]*typedef|extern[ \t]+"C")!;
-
-   $srclines[$lno] = "$l2\nint pgindent_func_no_var_fix;";
-   }
-   $source = join("\n", @srclines) . "\n";    # make sure there's a final \n
-

I don't have time now to double-check, but the above shouldn't be needed
anymore.


-   # Pull up single-line comment after 'else' that was pulled down above
-   $source =~ s!else\n[ \t]+/\* _PGMV!else\t/*!g;
-
-   # Indent single-line after-'else' comment by only one tab.
-   $source =~ s!(\}|[ \t])else[ \t]+(/\*.*\*/)[ \t]*$!$1else\t$2!gm;
-
-   # Add tab before comments with no whitespace before them (on a tab stop)
-   $source =~ s!(\S)(/\*.*\*/)$!$1\t$2!gm;
-
-   # Remove blank line between opening brace and block comment.
-   $source =~ s!(\t*\{\n)\n([ \t]+/\*)$!$1$2!gm;
-

These are almost definitely fixed in indent.


-   # cpp conditionals
-
-   # Reduce whitespace between #endif and comments to one tab
-   $source =~ s!^\#endif[ \t]+/\*!#endif   /*!gm;
-
## Functions

-   # Work around misindenting of function with no variables defined.
-   $source =~ s!^[ \t]*int[ \t]+pgindent_func_no_var_fix;[ \t]*\n{1,2}!!gm;
-

These have likely been fixed.


-   ## Other
-
-   # Remove too much indenting after closing brace.
-   $source =~ s!^\}\t[ \t]+!}\t!gm;
-
-   # Workaround indent bug that places excessive space before 'static'.
-   $source =~ s!^static[ \t]+!static !gm;
-
-   # Remove leading whitespace from typedefs
-   $source =~ s!^[ \t]+typedef enum!typedef enum!gm
- if $source_filename =~ 'libpq-(fe|events).h$';

I believe these have been fixed as well.


-   # Remove trailing blank lines
- 

Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Piotr Stefaniak
On 2017-05-17 17:55, Peter Eisentraut wrote:
> On 5/17/17 10:14, Tom Lane wrote:
>> What I was concerned about was that pgindent will reindent the second
>> line so that it's impossible to tell whether the spacing is correct.
> 
> pgindent moving string continuations to the left is a completely
> terrible behavior anyway and we should look into changing that.  Just
> look at the mess it makes out of SELECT queries in pg_dump.c.

If I remember correctly, it tries to right-align string literals to
whatever -l ("Maximum length of an output line") was set to.




Re: [HACKERS] Pulling up more complicated subqueries

2017-05-17 Thread David Rowley
On 18 May 2017 at 04:30, Robert Haas  wrote:
> On Wed, May 17, 2017 at 11:08 AM, Heikki Linnakangas  wrote:
>> That's not a straight semi-join, but we could still turn it into a new kind
>> of LEFT-SEMI join. A left-semi join is like a left join, in that it returns
>> all rows from the left side, and NULLs for any non-matches. And like a
>> semi-join, it returns only one matching row from the right-side, if there
>> are duplicates. In the qual, replace the SubLink with an IS NOT NULL test.
> ...
>> This can be implemented using yet another new join type, a LEFT-UNIQUE join.
>> It's like a LEFT JOIN, but it must check that there are no duplicates in the
>> right-hand-side, and throw an error if there are (ERROR:  more than one row
>> returned by a subquery used as an expression).
>
> It seems like we might want to split what is currently called JoinType
> into two separate things -- one that is INNER/LEFT/RIGHT/FULL and the
> other that says what to do about multiple matches, which could be that
> they are expected, they are to be ignored (as in your LEFT-SEMI case),
> or they should error out (as in your LEFT-UNIQUE case).

I just wanted to mention that I almost got sucked down that hole with
unique joins.  Instead, I'd recommend following the pattern of passing
bool flags down to the executor from the planner, just like what is
done for inner_unique in, say, make_hashjoin().

I did change unique joins at one point to overload the JoinType, but it
was a much scarier thing to do, as there's lots of code in the planner
that does special things based on the join type, and the footprint of
your patch and its risk factor start to grow pretty rapidly once you
do that.

I mention something around this in [1].

[1] 
https://www.postgresql.org/message-id/CAKJS1f_jRki1PQ4X-9UGKa-wnBhECQLnrxCX5haQzu4SDR_r2Q%40mail.gmail.com
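
A rough sketch of the shape being recommended -- the names are
illustrative, not the actual planner structs: keep JoinType for the
join's shape and carry the multiple-match policy as separate flags,
the way inner_unique is passed to make_hashjoin():

#include <stdbool.h>

typedef enum JoinType
{
    JOIN_INNER, JOIN_LEFT, JOIN_RIGHT, JOIN_FULL
} JoinType;

/* Illustrative only: the multiple-match policy travels as flags
 * alongside JoinType instead of being folded into it. */
typedef struct JoinMatchPolicy
{
    JoinType    jointype;       /* shape: INNER/LEFT/RIGHT/FULL */
    bool        inner_unique;   /* stop after the first inner match */
    bool        error_on_dup;   /* error on a second match, i.e. the
                                 * LEFT-UNIQUE behavior from upthread */
} JoinMatchPolicy;

int
main(void)
{
    /* Heikki's LEFT-SEMI: a left join that keeps one match and
     * ignores duplicates */
    JoinMatchPolicy p = {JOIN_LEFT, true, false};

    return (p.jointype == JOIN_LEFT) ? 0 : 1;
}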

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Alvaro Herrera
Piotr Stefaniak wrote:
> On 2017-05-17 17:55, Peter Eisentraut wrote:
> > On 5/17/17 10:14, Tom Lane wrote:
> >> What I was concerned about was that pgindent will reindent the second
> >> line so that it's impossible to tell whether the spacing is correct.
> > 
> > pgindent moving string continuations to the left is a completely
> > terrible behavior anyway and we should look into changing that.  Just
> > look at the mess it makes out of SELECT queries in pg_dump.c.
> 
> If I remember correctly, it tries to right-align string literals to
> whatever -l ("Maximum length of an output line") was set to.

Yeah, it does that (for error messages too).

I'm not sure what behavior we do want.  One choice is that the
continuation string opening quote should line up with the opening quote
in the previous line.  So for instance:

elog(ERROR, "this message is very long"
            "and the second line may get over the 80 chars limit if we let it");

but I suppose some would like the opening quote to line up with the
parens:

elog(ERROR, "this message is very long "
 "and the second line may get over the 80 chars limit if we let it");

I vote that the first form is best, even if it would cause me some pain
because my vim config doesn't line up continuations like that. (Not sure
I can tweak it to work that way.)

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Re: Event triggers + table partitioning cause server crash in current master

2017-05-17 Thread Noah Misch
On Mon, May 15, 2017 at 03:02:54PM +0900, Amit Langote wrote:
> On 2017/05/14 12:03, Mark Dilger wrote:
> > I discovered a reproducible crash using event triggers in the current
> > development version, 29c7d5e483acaa74a0d06dd6c70b320bb315.
> > I was getting a crash before this version, and cloned a fresh copy of
> > the sources to be sure I was up to date, so I don't think the bug can be
> > attributed to Andres' commit.  (The prior version I was testing against
> > was heavily modified by me, so I recreated the bug using the latest
> > standard, unmodified sources.)
> > 
> > I create both before and after event triggers early in the regression test
> > schedule, which then fire here and there during the following tests, leading
> > fairly reproducibly to the server crashing somewhere during the test suite.
> > These crashes do not happen for me without the event triggers being added
> > to the tests.  Many tests show as 'FAILED' simply because the logging
> > that happens in the event triggers creates unexpected output for the test.
> > Those "failures" are expected.  The server crashes are not.
> > 
> > The server logs suggest the crashes might be related to partitioned tables.
> > 
> > Please find attached the patch that includes my changes to the sources
> > for recreating this bug.  The logs and regression.diffs are a bit large; let
> > me know if you need them.
> > 
> > I built using the command
> > 
> > ./configure --enable-cassert --enable-tap-tests && make -j4 && make check
> 
> Thanks for the report and providing steps to reproduce.
> 
> It seems that it is indeed a bug related to creating range-partitioned
> tables.  DefineRelation() calls AlterTableInternal() to add NOT NULL
> constraints on the range partition key columns, but the code fails to
> first initialize the event trigger context information.  Attached patch
> should fix that.
> 
> Thanks to the above test case, I also discovered that in the case of
> creating a partition, manipulations performed by MergeAttributes() on the
> input schema list may cause it to become invalid, that is, the List
> metadata (length) will no longer match the reality, because while the
> ListCells are deleted from the input list, the List pointer passed to
> list_delete_cell does not point to the same list.  This caused a crash
> when the CreateStmt in question was subsequently passed to copyObject,
> which tried to access CreateStmt.tableElts that has become invalid as just
> described.  The attached patch also takes care of that.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: pgindent (was Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)

2017-05-17 Thread Tom Lane
Piotr Stefaniak  writes:
> Full copy of my pgindent attached.  Changes commented below.

Thanks!  I ran this, along with the indent copy I pulled from your
github repo a couple hours ago, over the current PG tree (post
Bruce's run).  I got a diff extending to about 100K lines :-(
which I will not post here.  It seemed like a very large fraction
of that was that old pgindent chooses to use a space rather than
a tab if the tab would only move over one column.  This version
uses a tab anyway.

I hacked around that by putting back a detab/entab step at the end
using the existing subroutines in pgindent.  That about halved the
size of the diff, but it's still too big to post.  Much of what
I'm seeing with this version is randomly different decisions about
how far to indent comments, eg

@@ -101,8 +101,8 @@ typedef struct BloomOptions
 {
int32   vl_len_;/* varlena header (do not touch directly!) */
int bloomLength;/* length of signature in words (not bits!) */
-   int bitSize[INDEX_MAX_KEYS];/* # of bits generated for
-* each index key */
+   int bitSize[INDEX_MAX_KEYS];/* # of bits generated for each
+* index key */
 } BloomOptions;
 
 /*

(I untabified the above fragment in the hope of making it more readable
in email.)

It does seem to be handling formatting around sizeof() calls a lot better
than the old code, as well as function pointer typedefs.  So those are
huge wins.  But can we avoid the changes mentioned above?  I'd like the
new version to only differ in ways that are clear improvements.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.

2017-05-17 Thread Tom Lane
Alvaro Herrera  writes:
> Piotr Stefaniak wrote:
>> If I remember correctly, it tries to right-align string literals to
>> whatever -l ("Maximum length of an output line") was set to.

> Yeah, it does that (for error messages too).

Piotr's version seems to at least do this more consistently than the
old version; for instance I notice this diff from Bruce's run:

@@ -1864,8 +1864,8 @@ describeOneTableDetails(const char *schemaname,
if (verbose)
printfPQExpBuffer(&buf,
  "SELECT inhparent::pg_catalog.regclass,"
-   "   pg_get_expr(c.relpartbound, inhrelid),"
- " pg_get_partition_constraintdef(inhrelid)"
+ " pg_get_expr(c.relpartbound, inhrelid),"
+ " pg_get_partition_constraintdef(inhrelid)"
  " FROM pg_catalog.pg_class c"
  " JOIN pg_catalog.pg_inherits"
  " ON c.oid = inhrelid"

(Again, untabified for clarity.)  However, it didn't do anything to any of
the horribly-formatted queries in pg_dump.c, so it's mostly following the
same rule as before.

> I'm not sure what's the behavior we do want.  One choice is that the
> continuation string opening quote should line up with the opening quote
> in the previous line.  So for instance:

Yeah, I'd vote for that one too.  If you want to line things up with
a function call paren, you can always start the whole literal on a fresh
line, as in the above example.

regards, tom lane




Re: [HACKERS] transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)

2017-05-17 Thread Thomas Munro
On Wed, May 17, 2017 at 7:42 PM, Thomas Munro
 wrote:
> On Wed, May 17, 2017 at 6:04 PM, Amit Langote
>  wrote:
>> targetRelInfo should instead be set to mtstate->rootResultRelInfo that was
>> set in ExecInitModifyTable() as described above, IOW, as follows:
>>
>> /* Partitioned table. */
>> if (mtstate->rootResultRelInfo != NULL)
>> targetRelInfo = mtstate->rootResultRelInfo;
>
> Ah, I see.  Thank you.  Fixed in the attached.

Here's a post-pgindent rebase.

Also, I discovered a preexisting bug that is independent of all this
inheritance stuff.  COPY in the batch optimisation case was failing to
capture transition tuples.  I thought about sending a separate patch
but this patch already has a regression test that covers it so I've
included it here.  It's this hunk:

@@ -2872,7 +2872,8 @@ CopyFromInsertBatch(CopyState cstate, EState
*estate, CommandId mycid,
 * anyway.
 */
else if (resultRelInfo->ri_TrigDesc != NULL &&
-resultRelInfo->ri_TrigDesc->trig_insert_after_row)
+(resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
+ resultRelInfo->ri_TrigDesc->trig_insert_new_table))
{
for (i = 0; i < nBufferedTuples; i++)
{

-- 
Thomas Munro
http://www.enterprisedb.com


transition-tuples-from-child-tables-v6.patch
Description: Binary data



Re: [HACKERS] PG10 pgindent run

2017-05-17 Thread Bruce Momjian
On Tue, May 16, 2017 at 09:00:38PM -0400, Bruce Momjian wrote:
> On Tue, May 16, 2017 at 08:39:36PM -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > There we go:
> > > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=calliphoridae&dt=2017-05-16%2023:16:53&stg=typedefs
> > 
> > Yup, looks good now.  Thanks!
> > 
> > BTW, comparing the typedef list to what I got a few hours ago, I see
> > that "Function" is now a known type name, along with the UWhatever
> > symbols from ICU.  I wonder where that came from?  A quick grep
> > suggests that it's not going to mess anything up too badly, but it
> > sure seems like a poor choice for a globally visible typedef name.
> 
> OK, I assume we are good to go for Wednesday afternoon, UTC.  Thanks for
> the research.

Done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] Server Crashes if try to provide slot_name='none' at the time of creating subscription.

2017-05-17 Thread Peter Eisentraut
On 5/16/17 22:21, Masahiko Sawada wrote:
> I think there are two bugs: pg_dump should dump slot_name = NONE
> instead of '', and a subscription should not be created if the given
> slot name is invalid.  The validation check for the replication slot
> name is actually done when creating it, but I think it's safer to
> check at CREATE SUBSCRIPTION time.  The bug related to setting
> slot_name = NONE should be fixed by the attached 001 patch, and the
> 002 patch prevents an invalid replication slot name from being
> specified in CREATE SUBSCRIPTION and ALTER SUBSCRIPTION SET.

I have worked through these issues and came up with slightly different
patches.

The issue in pg_dump should have been checking PQgetisnull() before
reading the slot name.  That is now fixed.

The issue with slot_name = NONE causing a crash was fixed by adding
additional error checking.  I did not change it so that slot_name = NONE
would change the defaults of enabled and create_slot.  I think it's
confusing if options have dependencies like that.  In any case, creating
a subscription with slot_name = NONE is probably not very useful, so as
long as it doesn't crash, we don't need to make it excessively
user-friendly.
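
For illustration, a subscription without a slot is then spelled roughly
like this (connection string and names are made up); with slot_name =
NONE, enabled and create_slot have to be given explicitly as false, and
getting that wrong now produces an error instead of a crash:

CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION mypub
    WITH (slot_name = NONE, enabled = false, create_slot = false);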

I don't think the subscription side should check the validity of the
replication slot name.  That is the responsibility of the publication
side.  The rules may well be different if you replicate between
different versions or different build options.  This is currently
working correctly: If the publication side doesn't like the slot you
specify, either because the name is invalid or the slot doesn't exist,
you get an appropriate error message.

Please check whether everything is working OK for you now.  I think
this open item can be closed.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw aggregation pushdown has collation change in 10beta.

2017-05-17 Thread Ashutosh Bapat
On Thu, May 18, 2017 at 12:37 AM, Jeff Janes  wrote:
> It is shipping collation-sensitive aggregates between servers which have
> different collations.
>
> commit 7012b132d07c2b4ea15b0b3cb1ea9f3278801d98
> Author: Robert Haas 
> Date:   Fri Oct 21 09:54:29 2016 -0400
>
> postgres_fdw: Push down aggregates to remote servers.
>
>
> I've attached a reproducing case.  Before this commit, the two final queries
> give the same answer, and after they give different answers.  Maybe this
> isn't considered a bug?  Is it just one of the "surprising semantic
> anomalies may arise when types or collations do not match"?  It should be
> able to know what collation the local definition of the foreign table has;
> couldn't it pass that collation over to the foreign side?
>
> I don't really care, I was just using min as a way to get an arbitrary value
> in the cases where there are more than one, but I found the change
> surprising.

Per [1]
--
COLLATE collation

The COLLATE clause assigns a collation to the column (which must be of
a collatable data type). If not specified, the column data type's
default collation is used.
--

In your test you have not specified the collation for column x in
remote1, so it is taken to have the DEFAULT collation on the local
server. From the POV of the local server, the collation of the column
on the foreign server is the same as the local default collation, so it
doesn't add any collation clause to the query. Maybe it could, but then
the problem is that the exact default collation on the local server may
not be available on the foreign server. What the foreign server has
might be a collation whose behaviour is the same as the local server's
default collation behaviour, as far as that column's data type is
concerned.
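
So the practical advice is to declare, in the local foreign-table
definition, the collation the remote column actually uses, so that both
servers agree on the column's semantics; a sketch with a made-up server
name:

CREATE FOREIGN TABLE remote1 (
    x text COLLATE "C"
) SERVER foreign_server OPTIONS (schema_name 'public', table_name 'remote1');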

But this is not specific to aggregation; WHERE and ORDER BY clauses
pushed down to the foreign server have the same behaviour.

[1] https://www.postgresql.org/docs/devel/static/sql-createforeigntable.html

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] BUG #14657: Server process segmentation fault in v10, May 10th dev snapshot

2017-05-17 Thread Amit Langote
On 2017/05/18 10:49, Amit Langote wrote:
> On 2017/05/18 2:14, Dilip Kumar wrote:
>> On Wed, May 17, 2017 at 7:41 PM,   wrote:
>>> (gdb) bt
>>> #0  0x0061ab1b in list_nth ()
>>> #1  0x005e4081 in ExecLockNonLeafAppendTables ()
>>> #2  0x005f4d52 in ExecInitMergeAppend ()
>>> #3  0x005e0365 in ExecInitNode ()
>>> #4  0x005f35a7 in ExecInitLimit ()
>>> #5  0x005e00f3 in ExecInitNode ()
>>> #6  0x005dd207 in standard_ExecutorStart ()
>>> #7  0x006f96d2 in PortalStart ()
>>> #8  0x006f5c7f in exec_simple_query ()
>>> #9  0x006f6fac in PostgresMain ()
>>> #10 0x00475cdc in ServerLoop ()
>>> #11 0x00692ffa in PostmasterMain ()
>>> #12 0x00476600 in main ()
> 
> Thanks for the test case Sveinn and thanks Dilip for analyzing.
> 
>> Seems like the issue is that the plans under multiple subroots are
>> pointing to the same partitioned_rels.
> 
> That's correct.
> 
>> If I am not getting it wrong, in "set_plan_refs(PlannerInfo *root,
>> Plan *plan, int rtoffset)" the rtoffset is specific to the subroot.
>> Now, the problem is that set_plan_refs called for different subroots
>> updates the same partitioned_rels info and makes those values
>> completely wrong, which ultimately makes ExecLockNonLeafAppendTables
>> access an out-of-bounds "rte" index.
> 
> Yes.
> 
>> set_plan_refs
>> {
>> [clipped]
>> case T_MergeAppend:
>> {
>> [clipped]
>>
>> foreach(l, splan->partitioned_rels)
>> {
>>  lfirst_int(l) += rtoffset;
>>
>>
>> I think the solution should be that create_merge_append_path makes a
>> copy of the partitioned_rels list?
> 
> Yes, partitioned_rels should be copied.
> 
>> Attached patch fixes the problem but I am not completely sure about the fix.
> 
> Thanks for creating the patch, although I think a better fix would be to
> make get_partitioned_child_rels() do the list_copy.  That way, any other
> users of partitioned_rels will not suffer the same issue.  Attached patch
> implements that, along with a regression test.
> 
> Added to the open items.

Oops, forgot to cc -hackers.  Patch attached again.

Thanks,
Amit
From 73754e24178087cf8aa53904acfb032343be25df Mon Sep 17 00:00:00 2001
From: amit 
Date: Thu, 18 May 2017 10:45:44 +0900
Subject: [PATCH] Fix crash when partitioned table Append referenced in
 subplans

---
 src/backend/optimizer/plan/planner.c  |  2 +-
 src/test/regress/expected/inherit.out | 31 +++
 src/test/regress/sql/inherit.sql  | 10 ++
 3 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c4a5651abd..bc2797e42c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6076,7 +6076,7 @@ get_partitioned_child_rels(PlannerInfo *root, Index rti)
 
 		if (pc->parent_relid == rti)
 		{
-			result = pc->child_rels;
+			result = list_copy(pc->child_rels);
 			break;
 		}
 	}
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index af7090ba0d..35d182d599 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1918,3 +1918,34 @@ explain (costs off) select * from mcrparted where a = 20 and c > 20; -- scans mc
 (7 rows)
 
 drop table mcrparted;
+-- check that partitioned table Appends cope with being referenced in
+-- subplans
+create table parted_minmax (a int, b varchar(16)) partition by range (a);
+create table parted_minmax1 partition of parted_minmax for values from (1) to (10);
+create index parted_minmax1i on parted_minmax1 (a, b);
+insert into parted_minmax values (1,'12345');
+explain (costs off) select min(a), max(a) from parted_minmax where b = '12345';
+  QUERY PLAN   
+---
+ Result
+   InitPlan 1 (returns $0)
+ ->  Limit
+   ->  Merge Append
+ Sort Key: parted_minmax1.a
+ ->  Index Only Scan using parted_minmax1i on parted_minmax1
+   Index Cond: ((a IS NOT NULL) AND (b = '12345'::text))
+   InitPlan 2 (returns $1)
+ ->  Limit
+   ->  Merge Append
+ Sort Key: parted_minmax1_1.a DESC
+ ->  Index Only Scan Backward using parted_minmax1i on parted_minmax1 parted_minmax1_1
+   Index Cond: ((a IS NOT NULL) AND (b = '12345'::text))
+(13 rows)
+
+select min(a), max(a) from parted_minmax where b = '12345';
+ min | max 
+-+-
+   1 |   1
+(1 row)
+
+drop table parted_minmax;
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 7f34f43ec0..70fe971d51 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -661,3 +661,13 @@ explain (costs off) select * from mcr

Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-05-17 Thread Michael Paquier
On Thu, May 18, 2017 at 12:06 AM, Bossart, Nathan  wrote:
> I agree with you here, too.  I stopped short of allowing customers to 
> explicitly provide per-table options, so the example you provided wouldn’t 
> work here.  This is more applicable for something like the following:
>
> VACUUM (FREEZE, VERBOSE) foo, bar (a);
>
> In this case, the FREEZE and VERBOSE options are used for both tables.  
> However, we have a column list specified for ‘bar’, and the ANALYZE option is 
> implied when we specify a column list.  So when we process ‘bar’, we need to 
> apply the ANALYZE option, but we do not need it for ‘foo’.  For now, that is 
> all that this per-table options variable is used for.

Hm. One argument can be made here: having a column list defined for one
of the tables could imply that ANALYZE is enforced for all the relations
listed, instead of only for the relations that list columns.
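
That is, under that argument the first statement below would be
interpreted as the second (a sketch of the semantics being discussed
for the proposed multi-table syntax, not of released behaviour):

VACUUM (FREEZE, VERBOSE) foo, bar (a);
-- would be treated as:
VACUUM (FREEZE, VERBOSE, ANALYZE) foo, bar (a);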
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix hard-coded index in make_partition_op_expr

2017-05-17 Thread Amit Langote
On 2017/05/18 3:43, Robert Haas wrote:
> On Wed, May 17, 2017 at 5:38 AM, Jeevan Ladhe
>  wrote:
>> While browsing through the partitioning code, I noticed that a recent
>> commit, f8bffe9e6d700fd34759a92e47930ce9ba7dcbd5, which fixes
>> multi-column range partitioning constraints, introduced a function
>> make_partition_op_expr that takes keynum as an input parameter to
>> identify the index of the partition key.
>> In the case of range partitioning we can have multiple partition keys,
>> but for list partitioning we have only one. Considering that, I think
>> the following code does not cause any logical side effect (and is
>> perhaps an oversight from when the code was moved from
>> get_qual_for_list to this function):
>>
>> saopexpr->inputcollid = key->partcollation[0];
>> saopexpr->args = list_make2(arg1, arg2);
>>
>> But as we have keynum now, shouldn't we be using it to index
>> key->partcollation, instead of a hard-coded '0'?
> 
> Agreed.  Committed your patch.

Me too, thanks both.

Regards,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

