Re: [HACKERS] expanding inheritance in partition bound order

2017-08-16 Thread Ashutosh Bapat
On Thu, Aug 17, 2017 at 10:54 AM, Amit Langote
 wrote:
> On 2017/08/17 13:56, Ashutosh Bapat wrote:
>> On Thu, Aug 17, 2017 at 8:06 AM, Amit Langote
>>  wrote:
>>> On 2017/08/17 11:22, Robert Haas wrote:
 On Wed, Aug 16, 2017 at 10:12 PM, Amit Langote
  wrote:
>> In the catalogs we are using full "partitioned" e.g. 
>> pg_partitioned_table. May
>> be we should name the column as "inhchildpartitioned".
>
> Sure.

 I suggest inhpartitioned or inhispartition.  inhchildpartitioned seems too 
 long.
>>>
>>> inhchildpartitioned indeed seems long.
>>>
>>> Since we storing if the child table (one with the OID inhrelid) is
>>> partitioned, inhpartitioned seems best to me.  Will implement that.
>>
>> inhchildpartitioned is long but clearly tells that the child table is
>> partitioned, not the parent. pg_inherit can have parents which are not
>> partitioned, so it's better to have self-explanatory catalog name. I
>> am fine with some other name as long as it's clear.
>
> OTOH, the pg_inherits field that stores the OID of the child table does
> not mention "child" in its name (inhrelid), although you are right that
> inhpartitioned can be taken to mean that the inheritance parent
> (inhparent) is partitioned.  In any case, system catalog documentation
> which clearly states what's what might be the best guide for the confused.
>
Sorry, I overlooked this detail. To me it means that the table is
driven by the child and inhpartitioned looks good then.


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-16 Thread Amit Langote
On 2017/08/17 13:56, Ashutosh Bapat wrote:
> On Thu, Aug 17, 2017 at 8:06 AM, Amit Langote
>  wrote:
>> On 2017/08/17 11:22, Robert Haas wrote:
>>> On Wed, Aug 16, 2017 at 10:12 PM, Amit Langote
>>>  wrote:
> In the catalogs we are using full "partitioned" e.g. 
> pg_partitioned_table. May
> be we should name the column as "inhchildpartitioned".

 Sure.
>>>
>>> I suggest inhpartitioned or inhispartition.  inhchildpartitioned seems too 
>>> long.
>>
>> inhchildpartitioned indeed seems long.
>>
>> Since we storing if the child table (one with the OID inhrelid) is
>> partitioned, inhpartitioned seems best to me.  Will implement that.
> 
> inhchildpartitioned is long but clearly tells that the child table is
> partitioned, not the parent. pg_inherit can have parents which are not
> partitioned, so it's better to have self-explanatory catalog name. I
> am fine with some other name as long as it's clear.

OTOH, the pg_inherits field that stores the OID of the child table does
not mention "child" in its name (inhrelid), although you are right that
inhpartitioned can be taken to mean that the inheritance parent
(inhparent) is partitioned.  In any case, system catalog documentation
which clearly states what's what might be the best guide for the confused.

Of course, we can add a comment in pg_inherits.h next to the field
explaining what it is for those reading the source code and confused about
what inhpartitioned means.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 2:08 PM, Masahiko Sawada  wrote:
> On Wed, Aug 16, 2017 at 4:37 PM, Michael Paquier
>  wrote:
>> On Wed, Aug 16, 2017 at 4:24 PM, Masahiko Sawada  
>> wrote:
>>> FWIW, in my opinion if tte current behavior of 'N(a,b)' could confuse
>>> users and we want to break the backward compatibility, I'd rather like
>>> to remove that style in PostgreSQL 10 and to raise an syntax error to
>>> user for more safety. Also, since the syntax 'a, b' might be opaque
>>> for new users who don't know the history of s_s_names syntax, we could
>>> unify its syntax to '[ANY|FIRST] N (a, b, ...)' syntax while keeping
>>> the '*'.
>>
>> I find the removal of a syntax in release N for something introduced
>> in release (N - 1) a bit hard to swallow from the user prospective.
>> What about just issuing a warning instead and say that the use of
>> ANY/FIRST is recommended? It costs nothing in maintenance to keep it
>> around.
>
> Yeah, I think that would be better. If we decide to not make quorum
> commit the default we can issue a warning in docs. Attached a draft
> patch.

I had in mind a ereport(WARNING) in create_syncrep_config. Extra
thoughts/opinions welcome.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Orphaned files in base/[oid]

2017-08-16 Thread Chris Travers
On Wed, Aug 16, 2017 at 7:15 PM, Andres Freund  wrote:

>
>
> I think this entirely is the wrong approach.  We shouldn't add weird
> check commands that require locks on pg_class, we should avoid leaving
> the orphaned files in the first place.  I've upthread outlined
> approached how to do so.
>

And in the mean time I suppose there is no reason that the approach I have
outlined cannot be used by an external program.I think it is sane to
draw the line at having the db responsible for cleaning up after itself
when something goes wrong and leave external programs to do after-the-fact
review and cleanup.

>
> Greetings,
>
> Andres Freund
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-08-16 Thread Masahiko Sawada
On Wed, Aug 16, 2017 at 4:37 PM, Michael Paquier
 wrote:
> On Wed, Aug 16, 2017 at 4:24 PM, Masahiko Sawada  
> wrote:
>> FWIW, in my opinion if tte current behavior of 'N(a,b)' could confuse
>> users and we want to break the backward compatibility, I'd rather like
>> to remove that style in PostgreSQL 10 and to raise an syntax error to
>> user for more safety. Also, since the syntax 'a, b' might be opaque
>> for new users who don't know the history of s_s_names syntax, we could
>> unify its syntax to '[ANY|FIRST] N (a, b, ...)' syntax while keeping
>> the '*'.
>
> I find the removal of a syntax in release N for something introduced
> in release (N - 1) a bit hard to swallow from the user prospective.
> What about just issuing a warning instead and say that the use of
> ANY/FIRST is recommended? It costs nothing in maintenance to keep it
> around.

Yeah, I think that would be better. If we decide to not make quorum
commit the default we can issue a warning in docs. Attached a draft
patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


warning_s_s_names.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-16 Thread Thomas Munro
On Thu, Aug 17, 2017 at 4:37 PM, Amit Kapila  wrote:
> Note - enable_gathermerge is not present in postgresql.conf. I think
> we should add it in the postgresql.conf.sample file.

+1

See also 
https://www.postgresql.org/message-id/CAEepm=0b7ym9mzsviq1d-hnt4koarvejvsfayvfyknv-pvd...@mail.gmail.com
.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 1:28 PM, Tom Lane  wrote:
> -1 ... every datatype I/O function is entitled to assume it's being
> invoked inside a transaction.  I do not think we should break that
> on a case-by-case basis.  So using timestamptz_in directly in xlog.c
> was a bad idea, and we need to rethink that.

[evil mode]
Why not just calling SetCurrentStatementStartTimestamp() before
parsing the timestamp? A bit hacky, but I think that it would work.
[/evil mode]

> Not sure offhand how to mechanize that given the current code
> structure.  We don't want to duplicate all the infrastructure
> used by timestamptz_in, certainly ... but how much of that
> code is depending in some way on being inside a transaction?
> I doubt that this issue with "now" is the only soft spot.

tomorrow, today, etc are other things.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-16 Thread Ashutosh Bapat
On Thu, Aug 17, 2017 at 8:06 AM, Amit Langote
 wrote:
> On 2017/08/17 11:22, Robert Haas wrote:
>> On Wed, Aug 16, 2017 at 10:12 PM, Amit Langote
>>  wrote:
 In the catalogs we are using full "partitioned" e.g. pg_partitioned_table. 
 May
 be we should name the column as "inhchildpartitioned".
>>>
>>> Sure.
>>
>> I suggest inhpartitioned or inhispartition.  inhchildpartitioned seems too 
>> long.
>
> inhchildpartitioned indeed seems long.
>
> Since we storing if the child table (one with the OID inhrelid) is
> partitioned, inhpartitioned seems best to me.  Will implement that.

inhchildpartitioned is long but clearly tells that the child table is
partitioned, not the parent. pg_inherit can have parents which are not
partitioned, so it's better to have self-explanatory catalog name. I
am fine with some other name as long as it's clear.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-16 Thread Amit Kapila
On Tue, Aug 15, 2017 at 7:16 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Aug 15, 2017 at 7:31 AM, Amit Kapila  wrote:
>>> Attached patch fixes the issue for me.  I have locally verified that
>>> the gather merge gets executed in rescan path.  I haven't added a test
>>> case for the same as having gather or gather merge on the inner side
>>> of join can be time-consuming.  However, if you or others feel that it
>>> is important to have a test to cover this code path, then I can try to
>>> produce one.
>
>> Committed.
>
>> I believe that between this commit and the test-coverage commit from
>> Andres, this open item is reasonably well addressed.  If someone
>> thinks more needs to be done, please specify.  Thanks.
>
> How big a deal do we think test coverage is?  It looks like
> ExecReScanGatherMerge is identical logic to ExecReScanGather,
> which *is* covered according to coverage.postgresql.org, but
> it wouldn't be too surprising if they diverge in future.
>
> I should think it wouldn't be that expensive to create a test
> case, if you already have test cases that invoke GatherMerge.
> Adding a right join against a VALUES clause with a small number of
> entries, and a non-mergeable/hashable join clause, ought to do it.
>

I have done some experiments based on this idea to generate a test,
but I think it is not as straightforward as it appears.  Below are
some of my experiments:

Query that uses GatherMerge in regression tests
-

regression=# explain (costs off)   select  string4, count((unique2))
from tenk1 group by string4 order by string4;
 QUERY PLAN

 Finalize GroupAggregate
   Group Key: string4
   ->  Gather Merge
 Workers Planned: 2
 ->  Partial GroupAggregate
   Group Key: string4
   ->  Sort
 Sort Key: string4
 ->  Parallel Seq Scan on tenk1
(9 rows)

Modified Query
--
regression=# explain (costs off) select  tenk1.hundred,
count((unique2)) from tenk1 right join (values (1,100), (2,200)) as t
(two, hundred) on t.two
> 1 and tenk1.hundred > 1 group by tenk1.hundred order by tenk1.hundred;
QUERY PLAN
--
 Sort
   Sort Key: tenk1.hundred
   ->  HashAggregate
 Group Key: tenk1.hundred
 ->  Nested Loop Left Join
   Join Filter: ("*VALUES*".column1 > 1)
   ->  Values Scan on "*VALUES*"
   ->  Gather
 Workers Planned: 4
 ->  Parallel Index Scan using tenk1_hundred on tenk1
   Index Cond: (hundred > 1)
(11 rows)

The cost of GatherMerge is always higher than Gather in this case
which is quite obvious as GatherMerge has to perform some additional
task. I am not aware of a way such that Grouping and Sorting can be
pushed below parallel node (Gather/GatherMerge) in this case, if there
is any such possibility, then it might prefer GatherMerge.

Another way to make it parallel is, add a new guc enable_gather
similar to enable_gathermerge and then set that to off, it will prefer
GatherMerge in that case.  I think it is anyway good to have such a
guc.  I will go and do it this way unless you have a better idea.

Note - enable_gathermerge is not present in postgresql.conf. I think
we should add it in the postgresql.conf.sample file.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

2017-08-16 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Aug 15, 2017 at 11:37 PM, Piotr Stefaniak
>  wrote:
>> At the very least, I think timestamptz_in() should either complain about
>> being called outside of transaction or return the expected value,
>> because returning year 2000 is unuseful at best. I would also like to
>> become able to do what I'm doing in a less hacky way (assuming there
>> isn't one already but I may be wrong), perhaps once there is a new
>> 'furthest' setting for recovery_target or when recovery_target_time =
>> 'now' works as I expected.

> Hm. I think that the most simple solution here would be to change
> GetCurrentDateTime() and GetCurrentTimeUsec() so as they use
> GetCurrentTimestamp() instead of the transaction-level equivalents if
> those code paths are invoked outside of a transaction.

-1 ... every datatype I/O function is entitled to assume it's being
invoked inside a transaction.  I do not think we should break that
on a case-by-case basis.  So using timestamptz_in directly in xlog.c
was a bad idea, and we need to rethink that.

The semantic problem here is that if "now" is defined to mean start
of current transaction --- which it is --- then that's meaningless
outside a transaction.  Redefining it as "start of current transaction,
unless we're not inside in a transaction, in which case something else"
isn't really an improvement.  I think we'd be much better off throwing
an error for this case.

Not sure offhand how to mechanize that given the current code
structure.  We don't want to duplicate all the infrastructure
used by timestamptz_in, certainly ... but how much of that
code is depending in some way on being inside a transaction?
I doubt that this issue with "now" is the only soft spot.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages

2017-08-16 Thread Dilip Kumar
On Wed, Jul 26, 2017 at 10:35 PM, Robert Haas  wrote:
> On Thu, Jun 8, 2017 at 10:44 AM, Dilip Kumar  wrote:
>> On Thu, May 18, 2017 at 8:07 PM, Robert Haas  wrote:
>>

Thanks for the feedback.  I haven't yet worked on optimizer side
feedback but I have done some work for improving the executor part,
please find my comment inline.

> There are two main considerations here.  One, it's better to lossify
> pages with many bits set than with few bits set, because the
> additional work we thereby incur is less.  Two, it's better to lossify
> pages that are in the same chunk as other pages which we are also
> going to lossify, because that's how we actually save memory.  The
> current code will cheerfully lossify a chunk that contains no other
> pages, or will lossify one page from a chunk but not the others,
> saving no memory but hurting performance.
>
> Maybe the simplest change here would be to make it so that when we
> decide to lossify a chunk, we lossify all pages in the chunk, but only
> if there's more than one.  In other words, given a proposed page P to
> lossify, probe the hash table for all keys in the same chunk as P and
> build up a words[] array for the proposed chunk.  If that words[]
> array will end up with only 1 bit set, then forget the whole thing;
> otherwise, delete all of the entries for the individual pages and
> insert the new chunk instead.

I have attempted a very simple POC with this approach just to see how
many lossy pages we can save if we lossify all the pages falling in
the same chunk first, before moving to the next page.  I have taken
some data on TPCH scale 20 with different work_mem (only calculated
lossy pages did not test performance).  I did not see a significant
reduction in lossy pages.  (POC patch attached with the mail in case
someone is interested to test or more experiment).

64MB

TPCH QueryHead Lossy_pages   Patch Lossy_pages
lossy_page_reduce
Q6   534984529745
  5239
Q15 535072 529785 5287
Q20   1586933 1584731 2202

40MB
TPCH Query   Head Lossy_pages  Patch Lossy_pages   lossy_page_reduce
Q6  995223  993490
  1733
Q14337894   332890
 5004
Q15995417   993511
 1906
Q20  1654016 1652982
   1034

20MB
TPCH QueryHead Lossy_pagesPatch Lossy_pages
lossy_page_reduce
Q4166551 165280
   1271
Q5330679 330028
 651
Q6   1160339   1159937
   402
Q14   666897666032
   865
Q15 1160518   1160017
  501
Q20 1982981   1982828
 153


> As far as the patch itself is concerned, tbm_calculate_exact_pages()
> is computing the number of "exact pages" which will fit into the
> TIDBitmap, but I think that instead of tbm_calculate_exact_pages() you
> should have something like tbm_calculate_entries() that just returns
> nbuckets, and then let the caller work out how many entries are going
> to be exact and how many are going to be inexact.  An advantage of
> that approach is that the new function could be used by tbm_create()
> instead of duplicating the logic.

Ok
  I'm not sure that the way you are
> doing the rest of the calculation is wrong, but I've got no confidence
> that it's right, either: the way WORDS_PER_CHUNK is used looks pretty
> random, and the comments aren't enough for me to figure it out.

+ * Eq1: nbuckets = exact_bucket + lossy_buckets
+ * Eq2: total_pages = exact_bucket + (lossy_buckets * WORDS_PER_CHUNK)

I have derived my formulae based on these two equations.  But, it
assumes that all the lossy_buckets(chunk) will hold a WORDS_PER_CHUNK
number of pages, which seems very optimistic.

>
> It's unclear what assumptions we should make while trying to estimate
> the number of lossy pages.  The effectiveness of lossification depends
> on the average number of pages that get folded into a chunk; but how
> many will that be?  If we made some of the improvements proposed
> above, it would probably be higher than it is now, but either way it's
> not clear what number to use.  One possible assumption is that the
> pages that get lossified are exactly average, so:
>
> double entries_saved_per_lossy_page = Max(heap_pages_fetched /
> tbm_max_entries - 1, 

Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

2017-08-16 Thread Michael Paquier
On Tue, Aug 15, 2017 at 11:37 PM, Piotr Stefaniak
 wrote:
> At the very least, I think timestamptz_in() should either complain about
> being called outside of transaction or return the expected value,
> because returning year 2000 is unuseful at best. I would also like to
> become able to do what I'm doing in a less hacky way (assuming there
> isn't one already but I may be wrong), perhaps once there is a new
> 'furthest' setting for recovery_target or when recovery_target_time =
> 'now' works as I expected.

Hm. I think that the most simple solution here would be to change
GetCurrentDateTime() and GetCurrentTimeUsec() so as they use
GetCurrentTimestamp() instead of the transaction-level equivalents if
those code paths are invoked outside of a transaction. Any code using
those routines would get stupid timestamps anyway if they try to use
keywords like 'today', 'now' or 'tomorrow' so that does not sound like
a bad change to me.

And yes, that's clearly a bug. Nice discovery.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extra Vietnamese unaccent rules

2017-08-16 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Aug 17, 2017 at 6:01 AM, Tom Lane  wrote:
>> I'm not really qualified to review the Python coding
>> style, but I did fix a typo in a comment.

> No pythonist here, but a large confusing "if" condition without any
> comments is better if split up and explained with comments if that can
> help in clarifying what the code is doing in any language, so thanks
> for keeping the code intact.

Certainly agreed on splitting up the logic into multiple statements.
I just meant that I don't know enough Python to know if there are
better ways to do these tests.  (It probably doesn't matter, since
performance of this script is not an issue, and it's not likely to
undergo a lot of further development either.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SCRAM salt length

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 12:10 AM, Aleksander Alekseev
 wrote:
>> The SCRAM salt length is currently set as
>>
>> /* length of salt when generating new verifiers */
>> #define SCRAM_DEFAULT_SALT_LEN 12
>>
>> without further comment.
>>
>> I suspect that this length was chosen based on the example in RFC 5802
>> (SCRAM-SHA-1) section 5.  But the analogous example in RFC 7677
>> (SCRAM-SHA-256) section 3 uses a length of 16.  Should we use that instead?

In the initial discussions there was as well a mention about using 16 bytes.
https://www.postgresql.org/message-id/507550bd.2030...@vmware.com
As we are using SCRAM-SHA-256, let's bump it up and be consistent.
That's now or never.

> Maybe this length was chosen just because it becomes a 16-characters
> string after base64encode. If I understand correctly RFC 5802 and RFC
> 7677 don't say much about the required or recommended length of the
> salt.

Yep, it doesn't provide any recommendation.

> I personally believe that 2^96 of possible salts is consistent with both
> RFCs and should be enough in practice.

(12 bytes * 8) = 96, so you would favor 12 as length.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-16 Thread Amit Langote
On 2017/08/17 11:22, Robert Haas wrote:
> On Wed, Aug 16, 2017 at 10:12 PM, Amit Langote
>  wrote:
>>> In the catalogs we are using full "partitioned" e.g. pg_partitioned_table. 
>>> May
>>> be we should name the column as "inhchildpartitioned".
>>
>> Sure.
> 
> I suggest inhpartitioned or inhispartition.  inhchildpartitioned seems too 
> long.

inhchildpartitioned indeed seems long.

Since we storing if the child table (one with the OID inhrelid) is
partitioned, inhpartitioned seems best to me.  Will implement that.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-08-16 Thread Stephen Frost
Noah, all,

On Wed, Aug 16, 2017 at 22:24 Noah Misch  wrote:

> On Tue, Aug 15, 2017 at 10:24:34PM +0200, Tobias Bussmann wrote:
> > I've tested the new \gx against 10beta and current git HEAD. Actually
> one of my favourite features of PostgreSQL 10! However in my environment it
> was behaving strangely. After some debugging I found that \gx does not work
> if you have \set FETCH_COUNT n before. Please find attached a patch that
> fixes this incl. new regression test.
>
> [Action required within three days.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 10 open item.  Stephen,
> since you committed the patch believed to have created it, you own this
> open
> item.  If some other commit is more relevant or if this does not belong as
> a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days
> of
> this message.  Include a date for your subsequent status update.  Testers
> may
> discover new open items at any time, and I want to plan to get them all
> fixed
> well in advance of shipping v10.  Consequently, I will appreciate your
> efforts
> toward speedy resolution.  Thanks.


I'm in Boston this week but should be able to address this no later than
Monday, August 21st,  and I will provide an update then.

Thanks!

Stephen


Re: [HACKERS] taking stdbool.h into use

2017-08-16 Thread Michael Paquier
On Wed, Aug 16, 2017 at 11:20 PM, Tom Lane  wrote:
> Don't know how far back you need to go to find Windows machines
> with 4-byte bool, but we have some pretty long-in-the-tooth
> buildfarm critters in that lineage, too.

>From VS 2003 and upwards the size has always been 1:
https://msdn.microsoft.com/en-us/library/s3f49ktz.aspx
So this gives some margin as the buildfarm uses VS 2008 and newer versions.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-08-16 Thread Noah Misch
On Tue, Aug 15, 2017 at 10:24:34PM +0200, Tobias Bussmann wrote:
> I've tested the new \gx against 10beta and current git HEAD. Actually one of 
> my favourite features of PostgreSQL 10! However in my environment it was 
> behaving strangely. After some debugging I found that \gx does not work if 
> you have \set FETCH_COUNT n before. Please find attached a patch that fixes 
> this incl. new regression test.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Stephen,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 10:12 PM, Amit Langote
 wrote:
>> In the catalogs we are using full "partitioned" e.g. pg_partitioned_table. 
>> May
>> be we should name the column as "inhchildpartitioned".
>
> Sure.

I suggest inhpartitioned or inhispartition.  inhchildpartitioned seems too long.

> There are a bunch of callers of find_all_inheritors() and
> find_inheritance_children.  Changes to make them all declare a pointless
> variable seemed off to me.  The conditional in question doesn't seem to be
> that expensive.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-16 Thread Amit Langote
Hi Ashutosh,

Thanks for the review and the updated patch.

On 2017/08/16 21:48, Ashutosh Bapat wrote:
> On Wed, Aug 16, 2017 at 11:06 AM, Amit Langote
>  wrote:
> 
>>
>>> This patch series is blocking a bunch of other things, so it would be
>>> nice if you could press forward with this quickly.
>>
>> Attached updated patches.
>>
> 
> Review for 0001. The attached patch has some minor changes to the
> comments and code.
> 
> + * All the relations in the partition tree (including 'rel') must have been
> + * locked (using at least the AccessShareLock) by the caller.
>
> It would be good if we can Assert this in the function. But I couldn't find a
> way to check whether the backend holds a lock of required strength. Is there
> any?

Currently there isn't.  Robert suggested a RelationLockHeldByMe(Oid) [1],
which is still a TODO on my plate.

> /*
> -* We locked all the partitions above including the leaf partitions.
> -* Note that each of the relations in *partitions are eventually
> -* closed by the caller.
> +* All the partitions were locked above.  Note that the relcache
> +* entries will be closed by ExecEndModifyTable().
>  */
> I don't see much value in this hunk,

I thought the new text was a bit clearer, but maybe that's just me.  Will
remove.

> +   list_free(all_parts);
> ExecSetupPartitionTupleRouting() will be called only once per DML statement.
> Leaking the memory for the duration of DML may be worth the time spent
> in the traversing
> the list and freeing each cell independently.

Fair enough, will remove the list_free().

> 0002 review
> +
> + 
> +  inhchildparted
> +  bool
> +  
> +  
> +   This is true if the child table is a partitioned table,
> +   false otherwise
> +  
> + 
> In the catalogs we are using full "partitioned" e.g. pg_partitioned_table. May
> be we should name the column as "inhchildpartitioned".

Sure.

> +#define OID_CMP(o1, o2) \
> +   ((o1) < (o2) ? -1 : ((o1) > (o2) ? 1 : 0));
> Instead of duplicating the logic in this macro and oid_cmp(), we may want to
> call this macro in oid_cmp()? Or simply call oid_cmp() from inhchildinfo_cmp()
> passing pointers to the OIDs; a pointer indirection would be costly, but still
> maintainable.

Actually, I avoided using oid_cmp exactly for that reason.

> +   if (num_partitioned_children)
> +   *num_partitioned_children = my_num_partitioned_children;
> +
> Instead of this conditional, why not to make every caller pass a pointer to
> integer. The callers will just ignore the value if they don't want it. If we 
> do
> this change, we can get rid of my_num_partitioned_children variable and
> directly update the passed in pointer variable.

There are a bunch of callers of find_all_inheritors() and
find_inheritance_children.  Changes to make them all declare a pointless
variable seemed off to me.  The conditional in question doesn't seem to be
that expensive.  (To be fair, the one introduced in find_all_inheritors()
kind of is as implemented by the patch, because it's executed for every
iteration of the foreach(l, rels_list) loop, which I will fix.)

> 
> inhrelid = ((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhrelid;
> -   if (numoids >= maxoids)
> +   is_partitioned = ((Form_pg_inherits)
> +   GETSTRUCT(inheritsTuple))->inhchildparted;
> Now that we are fetching two members from Form_pg_inherits structure, may be 
> we
> should use a local variable
> Form_pg_inherits inherits_tuple = GETSTRUCT(inheritsTuple);
> and use that to fetch its members.

Sure, will do.

> I am still reviewing changes in this patch.

Okay, will wait for more comments before sending the updated patches.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CA%2BTgmobwbh12OJerqAGyPEjb_%2B2y7T0nqRKTcjed6L4NTET6Fg%40mail.gmail.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting huge pages on Windows

2017-08-16 Thread Thomas Munro
On Wed, Apr 12, 2017 at 7:08 PM, Tsunakawa, Takayuki
 wrote:
> Oh, I got it now.  Thanks.  The revised patch is attached.  The only modified 
> file is pg_ctl.c.  The patch worked as expected.
>
> It is regrettable that I could not make it in time for PG 10, but I'd 
> appreciate it if you could review and commit this patch early in PG 11 while 
> our memory is fresh.  Thank you for your patience.  I'll create an entry in 
> the next CF soon.

Hi Takayuki-san,

win_large_pages_v13.patch seems to be the latest version posted, but
it no longer applies.  Could you please post a rebased version?

Thanks!

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Craig Ringer
On 17 August 2017 at 09:33, Andres Freund  wrote:

> On 2017-08-16 21:25:48 -0400, Robert Haas wrote:
> > On Wed, Aug 16, 2017 at 5:55 PM, Andres Freund 
> wrote:
> > > I think we should constrain the API to only allow later LSNs than
> > > currently in the slot, rather than arbitrary ones. That's why I was
> > > thinking of "forward".  I'm not convinced it's a good / safe idea to
> > > allow arbitrary values to be set.
> >
> > Maybe I shouldn't play the devil's advocate here, but isn't a feature
> > like this by definition only for people who Know What They Are Doing?
> > If so, why not let them back the slot up?  I'm sure that will work out
> > just fine.  They Know What They Are Doing.
>
> I have yet to hear a reason for allowing to move things backward etc. So
> I'm not sure what the benefit would be. But more importantly I'd like to
> make this available to non-superusers at some point, and there I think
> it's more important that they can't do bad things. The reason for
> allowing it for non-superusers is that I think it's quite a useful
> function to be used by an automated system. E.g. to ensure enough, but
> not too much, WAL is available for a tertiary standby both on the actual
> primary and a failover node.
>

I strongly agree.

If you really need to move a physical slot back (why?) you can do it with
an extension that uses the low level APIs. But I can't see why you would
want to.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Extra Vietnamese unaccent rules

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 6:01 AM, Tom Lane  wrote:
> Pushed into v11.

Thanks.

> I'm not really qualified to review the Python coding
> style, but I did fix a typo in a comment.

No pythonist here, but a large confusing "if" condition without any
comments is better if split up and explained with comments if that can
help in clarifying what the code is doing in any language, so thanks
for keeping the code intact.

> BTW, while this isn't a reason to delay this patch, I wonder whether
> the regression test for unaccent is really adequate.  According to
> https://coverage.postgresql.org/contrib/unaccent/unaccent.c.gcov.html
> it isn't doing anything to check multicharacter source strings, and
> what's considerably more disturbing, it isn't exercising the PG_CATCH
> code that's meant to deal with characters outside the current database's
> encoding.

Yeah, that could be improved a bit.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Tatsuo Ishii
> On Thu, Aug 17, 2017 at 9:15 AM, Tatsuo Ishii  wrote:
>>> He meant logical replication,
>>
>> Oh I could not find he meant logical replication in the original
>> report.
> 
> The second message of the thread says so, but the first does not
> mention logical replication at all.
>>From here are mentioned PG 9.6 and pg_basebackup:
> https://www.postgresql.org/message-id/cahbggj8g2t+zdcacz2fmzx9ctxkwjkbshd6nkyb4i9ojf6k...@mail.gmail.com.
> Explaining the confusion.

Oh, I see it now. Thanks.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Broken link to DocBook XSL Stylesheets

2017-08-16 Thread Masahiko Sawada
On Thu, Aug 17, 2017 at 3:52 AM, Peter Eisentraut
 wrote:
> On 6/20/17 22:20, Masahiko Sawada wrote:
>> Hi,
>>
>> In J.2. Tool Sets section of documentation, there is a link to DocBook
>> XSL Stylesheets but that link seems no longer available. I got 404
>> error.
>>
>> J.2. Tool Sets
>> 
>>
>> As a updated next link I guess we can use the following link.
>> 
>
> Fixed.

Thank you for picking up it and committing!

> (I only backpatched this to PG10, because with older versions you then
> get to the issue that the DSSSL stylesheets don't appear to have any
> website at all anymore.  So better leave all of that rot consistently. ;-) )

Agreed.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Andres Freund
On 2017-08-16 21:25:48 -0400, Robert Haas wrote:
> On Wed, Aug 16, 2017 at 5:55 PM, Andres Freund  wrote:
> > I think we should constrain the API to only allow later LSNs than
> > currently in the slot, rather than arbitrary ones. That's why I was
> > thinking of "forward".  I'm not convinced it's a good / safe idea to
> > allow arbitrary values to be set.
> 
> Maybe I shouldn't play the devil's advocate here, but isn't a feature
> like this by definition only for people who Know What They Are Doing?
> If so, why not let them back the slot up?  I'm sure that will work out
> just fine.  They Know What They Are Doing.

I have yet to hear a reason for allowing to move things backward etc. So
I'm not sure what the benefit would be. But more importantly I'd like to
make this available to non-superusers at some point, and there I think
it's more important that they can't do bad things. The reason for
allowing it for non-superusers is that I think it's quite a useful
function to be used by an automated system. E.g. to ensure enough, but
not too much, WAL is available for a tertiary standby both on the actual
primary and a failover node.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 9:15 AM, Tatsuo Ishii  wrote:
>> He meant logical replication,
>
> Oh I could not find he meant logical replication in the original
> report.

The second message of the thread says so, but the first does not
mention logical replication at all.
>From here are mentioned PG 9.6 and pg_basebackup:
https://www.postgresql.org/message-id/cahbggj8g2t+zdcacz2fmzx9ctxkwjkbshd6nkyb4i9ojf6k...@mail.gmail.com.
Explaining the confusion.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 5:55 PM, Andres Freund  wrote:
> I think we should constrain the API to only allow later LSNs than
> currently in the slot, rather than arbitrary ones. That's why I was
> thinking of "forward".  I'm not convinced it's a good / safe idea to
> allow arbitrary values to be set.

Maybe I shouldn't play the devil's advocate here, but isn't a feature
like this by definition only for people who Know What They Are Doing?
If so, why not let them back the slot up?  I'm sure that will work out
just fine.  They Know What They Are Doing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-16 Thread Amit Langote
Hi Amit,

Thanks for the comments.

On 2017/08/16 20:30, Amit Khandekar wrote:
> On 16 August 2017 at 11:06, Amit Langote  
> wrote:
> 
>> Attached updated patches.
> 
> Thanks Amit for the patches.
> 
> I too agree with the overall approach taken for keeping the locking
> order consistent: it's best to do the locking with the existing
> find_all_inheritors() since it is much cheaper than to lock them in
> partition-bound order, the later being expensive since it requires
> opening partitioned tables.

Yeah.  Per the Robert's design idea, we will always do the *locking* in
the order determined by find_all_inheritors/find_inheritance_children.

>> I haven't yet done anything about changing the timing of opening and
>> locking leaf partitions, because it will require some more thinking about
>> the required planner changes.  But the above set of patches will get us
>> far enough to get leaf partition sub-plans appear in the partition bound
>> order (same order as what partition tuple-routing uses in the executor).
> 
> So, I believe none of the changes done in pg_inherits.c are essential
> for expanding the inheritence tree in bound order, right ?

Right.

The changes to pg_inherits.c are only about recognizing partitioned tables
in an inheritance hierarchy and putting them ahead in the returned list.
Now that I think of it, the title of the patch (teach pg_inherits.c about
"partitioning") sounds a bit confusing.  In particular, the patch does not
teach it things like partition bound order, just that some tables in the
hierarchy could be partitioned tables.

> I am not
> sure whether we are planning to commit these two things together or
> incrementally :
> 1. Expand in bound order
> 2. Allow for locking only the partitioned tables first.
> 
> For #1, the changes in pg_inherits.c are not required (viz, keeping
> partitioned tables at the head of the list, adding inhchildparted
> column, etc).

Yes.  Changes to pg_inherits.c can be committed completely independently
of either 1 or 2, although then there would be nobody using that capability.

About 2: I think the capability to lock only the partitioned tables in
expand_inherited_rtentry() will only be used once we have the patch to do
the necessary planner restructuring that will allow us to defer child
table locking to some place that is not expand_inherited_rtentry().  I am
working on that patch now and should be able to show something soon.

> If we are going to do #2 together with #1, then in the patch set there
> is no one using the capability introduced by #2. That is, there are no
> callers of find_all_inheritors() that make use of the new
> num_partitioned_children parameter. Also, there is no boolean
> parameter  for find_all_inheritors() to be used to lock only the
> partitioned tables.
> 
> I feel we should think about
> 0002-Teach-pg_inherits.c-a-bit-about-partitioning.patch later, and
> first get the review done for the other patches.

I think that makes sense.

> I see that RelationGetPartitionDispatchInfo() now returns quite a
> small subset of what it used to return, which is good. But I feel for
> expand_inherited_rtentry(), RelationGetPartitionDispatchInfo() is
> still a bit heavy. We only require the oids, so the
> PartitionedTableInfo data structure (including the pd->indexes array)
> gets discarded.

Maybe we could make the output argument optional, but I don't see much
point in being too conservative here.  Building the indexes array does not
cost us that much and if a not-too-distant-in-future patch could use that
information somehow, it could do so for free.

> Also, RelationGetPartitionDispatchInfo() has to call get_rel_relkind()
> for each child. In expand_inherited_rtentry(), we anyway have to open
> all the child tables, so we get the partition descriptors for each of
> the children for free. So how about, in expand_inherited_rtentry(), we
> traverse the partition tree using these descriptors similar to how it
> is traversed in RelationGetPartitionDispatchInfo() ? May be to avoid
> code duplication for traversing, we can have a common API.

As mentioned, one goal I'm seeking is to avoid having to open the child
tables in expand_inherited_rtentry().

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 9:19 AM, Craig Ringer  wrote:
> pg_advance_replication_slot(...)
>
> ERROR's on logical slot, for now. Physical slots only.
>
> Forward-only.
>
> Future work to allow it to use the logical decoding infrastructure to
> fast-forward a slot by reading only catalog change information and
> invalidations, either via a dummy output plugin that filters out all xacts,
> or by lower level use of the decoding code.
>
> Reasonable?

Yes. I did not imply logical slots in my previous message, sorry if my
words were incomplete.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix number skipping in to_number

2017-08-16 Thread Thomas Munro
On Thu, Aug 10, 2017 at 10:21 PM, Oliver Ford  wrote:
> Prevents an issue where numbers can be skipped in the to_number()
> function when the format mask contains a "G" or a "," but the input
> string doesn't contain a separator. This resolves the TODO item "Fix
> to_number() handling for values not matching the format string".

>From the TODO I found the previous discussion here:

https://www.postgresql.org/message-id/flat/be46a4f30909212157o71dc82bep7e074f9fa7eb1d14%40mail.gmail.com

There were some votes against changing it and some votes for.  I don't
see what's good about the current behaviour.  It's hard (though not
impossible) to imagine that someone is depending on that weirdness,
and it's harder to believe that anyone wants that behaviour knowingly.
For people migrating from Oracle or writing application portable to
both, the current behaviour sucks.  It's not a SQL standard feature,
it's a be-like-Oracle feature, so it should be like Oracle.  Aside
from that, by the principle of least astonishment alone I think it's
pretty clear that "a separator goes here" shouldn't mean "eat a
digit".

So +1 from me.

> == Change ==
>
> Currently, if the format mask in to_number() has a "G" or a ",", it
> will assume that the input string contains a separator character in
> the same place. If however a number is there instead, that number will
> be silently skipped and not appear in the output. So we get:
>
> select to_number('34,50','999,99');
>  to_number
> ---
>340
> (1 row)
>
> This patch checks the input string when it encounters a "G" or "," in
> the format mask. If the separator character is found, the code moves
> over it as normal. If it's not found, then the code no longer
> increments the relevant pointer so as not to skip the character. After
> the patch, we get the correct result:
>
> select to_number('34,50','999,99');
>  to_number
> ---
>   3450
> (1 row)
>
> This is in line with Oracle's result.

In other words the separators are completely ignored.  Presumably the
only reason you'd have them at all is so that you could use the same
format string in to_number() and to_char() calls to do round-trip
conversions.  That seems reasonable to me.

> == Rationale ==
>
> This patch is a small change, which leaves PostgreSQL behavior
> different from Oracle behavior in related cases. Oracle's
> implementation seems to read from right-to-left, and raises an
> "ORA-01722: invalid number" error if there are digits in the input
> string which don't have corresponding characters in the format mask. I
> have chosen not to throw such errors, because there are use cases for
> only returning part of a number string. For example, the following is
> an error on Oracle:
>
> select to_number('123,000', '999G') from dual;
> Error report -
> SQL Error: ORA-01722: invalid number
>
> But if you wanted to only take the characters before the comma, and
> discard the thousands part, you can do so on PostgreSQL with:
>
> select to_number('123,000', '999G');
>  to_number
> ---
>123
> (1 row)

I can see that it's hard to change this one, because it's more likely
to break existing applications that were written by people who didn't
think of the format string as limiting the maximum size of the number.
I think someone could argue for changing that too, but I agree with
your choice for this patch.

> == Testing ==
>
> Tested on Windows with MinGW using the latest checkout from master.
> Added regression tests to check for this new behavior. All existing
> tests pass.

The tests you added pass for me but the int8 test now fails with the
following (this is from regression.diff after running
'PG_REGRESS_DIFF_OPTS="-dU10" make check').  It looks like some new
whitespace is appearing on the right in the output of to_char().  I
didn't try to understand why.

@@ -453,34 +453,34 @@
 --+--
  4567890123456789 | 4567890123456789
 (1 row)

 -- TO_CHAR()
 --
 SELECT '' AS to_char_1, to_char(q1, '9G999G999G999G999G999'),
to_char(q2, '9,999,999,999,999,999')
FROM INT8_TBL;
  to_char_1 |to_char |to_char
 ---++
-   |123 |456
+   |123 |   456
|123 |  4,567,890,123,456,789
-   |  4,567,890,123,456,789 |123
+   |  4,567,890,123,456,789 |   123
|  4,567,890,123,456,789 |  4,567,890,123,456,789
|  4,567,890,123,456,789 | -4,567,890,123,456,789
 (5 rows)

 SELECT '' AS to_char_2, to_char(q1, '9G999G999G999G999G999D999G999'),
to_char(q2, '9,999,999,999,999,999.999,999')
FROM INT8_TBL;
  to_char_2 |to_char |to_char
 ---++
-   |123.000,000 |   

Re: [HACKERS] [COMMITTERS] pgsql: Include foreign tables in information_schema.table_privileges

2017-08-16 Thread Peter Eisentraut
On 8/16/17 07:27, Robert Haas wrote:
> On Tue, Aug 15, 2017 at 7:41 PM, Peter Eisentraut  wrote:
>> Include foreign tables in information_schema.table_privileges
>>
>> This appears to have been an omission in the original commit
>> 0d692a0dc9f.  All related information_schema views already include
>> foreign tables.
>>
>> Reported-by: Nicolas Thauvin 
> 
> I would have been disinclined to back-patch this.  Somebody could be
> depending on the current behavior, and now we'll end up with two kinds
> of clusters, one where initdb was done before these changes and
> another where it was done afterward.  So nobody will be able to count
> on this behavior one way or the other.

I understand that.  But this is an additive change, so to speak, so
whatever behavior someone was relying on should continue to work.

I did not backpatch 9b5140fb503eb50634cd7e080d41f4d9af41e0a6 because
that is a behavior-breaking change.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2017-08-16 Thread Craig Ringer
On 16 August 2017 at 23:14, Robert Haas  wrote:

> On Tue, Aug 15, 2017 at 4:36 PM, Tomas Vondra
>  wrote:
> > You might say that people investigating issues in this area of code
> should
> > be aware of how HEAP_XMIN_FROZEN is defined, and perhaps you're right ...
>
> Yes, I think that's what I would say.  I mean, if you happen to NOT
> know that committed|invalid == frozen, but you DO know what committed
> means and what invalid means, then you're going to be *really*
> confused when you see committed and invalid set on the same tuple.
> Showing you frozen has got to be clearer.
>
> Now, I agree with you that a test like (enumval_tup->t_data &
> HEAP_XMIN_COMMITTED) could be confusing to someone who doesn't realize
> that HEAP_XMIN_FROZEN & HEAP_XMIN_COMMITTED == HEAP_XMIN_COMMITTED,
> but I think that's just one of those things that unfortunately is
> going to require adequate knowledge for people investigating issues.
> If there's an action item there, it might be to try to come up with a
> way to make the source code clearer.
>
>
For other multi-purpose flags we have macros, and I think it'd make sense
to use them here too.

Eschew direct use of  HEAP_XMIN_COMMITTED, HEAP_XMIN_INVALID and
HEAP_XMIN_FROZEN in tests. Instead, consistently use HeapXminIsFrozen(),
HeapXminIsCommitted(), and HeapXminIsInvalid() or something like that.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Craig Ringer
On 17 August 2017 at 07:30, Michael Paquier 
wrote:

>
> Definitely agreed on that. Any move function would need to check if
> the WAL position given by caller is already newer than what's
> available in the local pg_wal (minimum of all other slots), with a
> shared lock that would need to be taken by xlog.c when recycling past
> segments. A forward function works on a single entry, which should be
> disabled at the moment of the update. It looks dangerous to me to do
> such an operation if there is a consumer of a slot currently on it.
>
>
pg_advance_replication_slot(...)

ERROR's on logical slot, for now. Physical slots only.

Forward-only.

Future work to allow it to use the logical decoding infrastructure to
fast-forward a slot by reading only catalog change information and
invalidations, either via a dummy output plugin that filters out all xacts,
or by lower level use of the decoding code.

Reasonable?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Tatsuo Ishii
> He meant logical replication,

Oh I could not find he meant logical replication in the original
report.

> but the code in question here is the same
> for streaming replication, or whatever it's called.

Yes.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread David Rowley
On 17 August 2017 at 01:20, Heikki Linnakangas  wrote:
> Looks reasonable. I edited the comments and the variable names a bit, to my
> liking, and committed. Thanks!

Thanks for committing. I've just been catching up on all that went on
while I was sleeping. Thanks for handling the cleanup too.

I'll feel better once pademelon goes green again. From looking at the
latest failure on it, it appears that your swapping of
pg_atomic_write_u64 for pg_atomic_init_u64 should fix this. My
apologies for that mistake.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Tatsuo Ishii
> In practice it's largely about architecture, I think.  You definitely
> need the same endianness and floating-point format, which are hardware,
> and you need the same MAXALIGN, which is partly hardware but in principle
> could be chosen differently in different ABI conventions for the same
> hardware.  (At least for non-alignment-picky hardware like Intel.
> Alignment-picky hardware is likely going to dictate that choice too.)
> 
> We disclaim cross-OS portability partly because of the possible influence
> of different ABI conventions, but it doesn't surprise me in the least if
> in practice 64-bit Windows can interoperate with x86_64 Linux.  (Less sure
> about the 32-bit case --- it looks like pg_config.h.win32 chooses
> MAXALIGN 8 in all cases, which would mean 32-bit Windows is more likely to
> interoperate with 64-bit Linux than 32-bit Linux.)

I feel like we need to be a little bit more cleaner about this
in our official documentation. From section 26.2.1. Planning:

"Hardware need not be exactly the same, but experience shows that
maintaining two identical systems is easier than maintaining two
dissimilar ones over the lifetime of the application and system. In
any case the hardware architecture must be the same ― shipping from,
say, a 32-bit to a 64-bit system will not work."

Probably We should disclaim cross-OS portability here?

> The thing that's really likely to bite you cross-platform is dependency
> of textual index sort ordering on non-C locale definitions.  But that
> wouldn't show up in a quick smoke test of replication ... and if you
> really wanted, you could use C locale on both ends.

Of course.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Subscription code improvements

2017-08-16 Thread Peter Eisentraut
On 8/8/17 05:58, Masahiko Sawada wrote:
> Are you planning to work on remaining patches 0005 and 0006 that
> improve the subscription codes in PG11 cycle? If not, I will take over
> them and work on the next CF.

Based on your assessment, the remaining patches were not required bug
fixes.  So I think preparing them for the next commit fest would be great.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Peter Eisentraut
On 8/16/17 19:41, Michael Paquier wrote:
> On Thu, Aug 17, 2017 at 8:13 AM, Tatsuo Ishii  wrote:
>> Do we allow streaming replication among different OS?
> 
> No. WAL is a binary format.
> 
>> I thought it is
>> required that primary and standbys are same platform (in my
>> understanding this means the same hardware architecture and OS) in
>> streaming replication.
> 
> Yep, I recall the same requirement.

He meant logical replication, but the code in question here is the same
for streaming replication, or whatever it's called.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Aug 17, 2017 at 8:13 AM, Tatsuo Ishii  wrote:
>> Do we allow streaming replication among different OS?

> No. WAL is a binary format.

>> I thought it is
>> required that primary and standbys are same platform (in my
>> understanding this means the same hardware architecture and OS) in
>> streaming replication.

> Yep, I recall the same requirement.

In practice it's largely about architecture, I think.  You definitely
need the same endianness and floating-point format, which are hardware,
and you need the same MAXALIGN, which is partly hardware but in principle
could be chosen differently in different ABI conventions for the same
hardware.  (At least for non-alignment-picky hardware like Intel.
Alignment-picky hardware is likely going to dictate that choice too.)

We disclaim cross-OS portability partly because of the possible influence
of different ABI conventions, but it doesn't surprise me in the least if
in practice 64-bit Windows can interoperate with x86_64 Linux.  (Less sure
about the 32-bit case --- it looks like pg_config.h.win32 chooses
MAXALIGN 8 in all cases, which would mean 32-bit Windows is more likely to
interoperate with 64-bit Linux than 32-bit Linux.)

The thing that's really likely to bite you cross-platform is dependency
of textual index sort ordering on non-C locale definitions.  But that
wouldn't show up in a quick smoke test of replication ... and if you
really wanted, you could use C locale on both ends.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't force-assign transaction id when exporting a snapshot.

2017-08-16 Thread Peter Eisentraut
On 8/14/17 13:57, Andres Freund wrote:
> On 2017-08-14 13:55:29 -0400, Peter Eisentraut wrote:
>> On 8/12/17 07:32, Petr Jelinek wrote:
>>> This commit has side effect that it makes it possible to export
>>> snapshots on the standbys. This makes it possible to do pg_dump -j on
>>> standby with consistent snapshot. Here is one line patch (+ doc update)
>>> which allows doing that when pg_dumping from PG10+.
>>>
>>> I also wonder if it should be mentioned in release notes. If the
>>> attached patch would make it into PG10 it would be no brainer to mention
>>> it as feature under pg_dump section, but exporting snapshots alone I am
>>> not sure about.
>>
>> Any other opinions on this patch?
> 
> I'd be oh so very slightly inclined to push this, including the modified
> pg_dump check. We're going to be on the hook for supporting snapshots on
> standbys anyway, unless we put in additional unnecessary error check.

committed to 10 and master

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 8:13 AM, Tatsuo Ishii  wrote:
> Do we allow streaming replication among different OS?

No. WAL is a binary format.

> I thought it is
> required that primary and standbys are same platform (in my
> understanding this means the same hardware architecture and OS) in
> streaming replication.

Yep, I recall the same requirement.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2017-08-16 Thread Thomas Munro
On Thu, Jun 29, 2017 at 12:24 PM, Thomas Munro
 wrote:
> fallocate-v5.patch

Added to commitfest so we don't lose track of this.

I'm mainly concerned about the fact that we have a way for PostgreSQL
to die that looks exactly like a bug, when really it's masking an
out-of-memory condition that a DBA or sysadmin would normally be able
to diagnose by the appearance of the OOM reaper at the window holding
a scythe.  I suppose the SIGBUS case is much more likely to happen
when we start actively using large amounts of dynamic shared memory
(parallel hash), but I suppose it could happen now if your system is
already overcommitted and a small DSM segment happens to push you over
the edge.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Michael Paquier
On Thu, Aug 17, 2017 at 7:14 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2017-08-16 17:06:42 -0400, Tom Lane wrote:
>>> If I understand what this is meant to do, maybe better
>>> pg_move_replication_slot_lsn() or pg_change_replication_slot_lsn() ?
>>> The point being that you're adjusting the LSN pointer contained
>>> in the slot, which is distinct from the slot itself.
>
>> I think we should constrain the API to only allow later LSNs than
>> currently in the slot, rather than arbitrary ones. That's why I was
>> thinking of "forward".  I'm not convinced it's a good / safe idea to
>> allow arbitrary values to be set.
>
> +1 for constraining it like that, but I don't think that's an argument
> against using "move" or "change" as the verb.  I don't like "forward"
> because that's not the right word.  The only verb senses of "forward"
> in my Mac's dictionary are "send a message on to a further destination"
> and "help to advance or promote" (the latter usage is pretty obscure IMO).
> Neither one seems applicable here.

Definitely agreed on that. Any move function would need to check if
the WAL position given by caller is already newer than what's
available in the local pg_wal (minimum of all other slots), with a
shared lock that would need to be taken by xlog.c when recycling past
segments. A forward function works on a single entry, which should be
disabled at the moment of the update. It looks dangerous to me to do
such an operation if there is a consumer of a slot currently on it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Replication to Postgres 10 on Windows is broken

2017-08-16 Thread Tatsuo Ishii
> Thank you Tom Lane,
> This patch fixes the problem.
> 
> With this patch, streaming replication started working (replication to
> Windows)
> 
> (Tested for Linux to Windows replication)

Do we allow streaming replication among different OS? I thought it is
required that primary and standbys are same platform (in my
understanding this means the same hardware architecture and OS) in
streaming replication.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Andres Freund
On 2017-08-16 18:14:45 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-08-16 17:06:42 -0400, Tom Lane wrote:
> >> If I understand what this is meant to do, maybe better
> >> pg_move_replication_slot_lsn() or pg_change_replication_slot_lsn() ?
> >> The point being that you're adjusting the LSN pointer contained
> >> in the slot, which is distinct from the slot itself.
> 
> > I think we should constrain the API to only allow later LSNs than
> > currently in the slot, rather than arbitrary ones. That's why I was
> > thinking of "forward".  I'm not convinced it's a good / safe idea to
> > allow arbitrary values to be set.
> 
> +1 for constraining it like that, but I don't think that's an argument
> against using "move" or "change" as the verb.  I don't like "forward"
> because that's not the right word.  The only verb senses of "forward"
> in my Mac's dictionary are "send a message on to a further destination"
> and "help to advance or promote" (the latter usage is pretty obscure IMO).
> Neither one seems applicable here.

I might have thinking too much with the German "version" of the word in
mind.  However looking it up now I do see "to advance or help onward;
promote" and "to advance or play a cassette, digital recording, slide
projector, etc., in the forward direction: " -
http://www.dictionary.com/browse/forward
which kind of seems to fit.  don't know how authoritative which
dictionary is considered to be...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Kenneth Marshall  writes:
> On Wed, Aug 16, 2017 at 05:58:41PM -0400, Tom Lane wrote:
>> ... In fact, on perusing the linked-to page
>> http://burtleburtle.net/bob/hash/doobs.html
>> Bob says specifically that taking b and c from this hash does not
>> produce a fully random 64-bit result.  He has a new one that does,
>> lookup3.c, but probably he hasn't tried to make that bit-compatible
>> with the 1997 algorithm.

> The updated hash functions that we currently use are based on Bob Jenkins
> lookup3.c and using b as the higher order bits is pretty darn good. I had
> lobbied to present the 64-bit b+c hash in the original work for similar
> reasons. We are definitely not using a lookup2.c version from 1997.

Oh --- I overlooked the bit about "Bob's 2006 update".  Really that
comment block should have been completely rewritten, rather than leaving
the original text there, especially since as it stands there are only
pointers to the old algorithm.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Andres Freund  writes:
> On August 16, 2017 3:09:27 PM PDT, Tom Lane  wrote:
>> I wonder whether it's sensible to have --enable-cassert have the effect
>> of filling memory allocated by ShmemAlloc or the DSA code with junk (as
>> palloc does) instead of leaving it at zeroes.  It's not modeling the
>> same kind of effect, since we have no shmem-freeing primitives, but
>> it might be useful for this sort of thing.

> We kind of do - crash restarts... So yes, that's probably a good idea.

Crash restart releases the shmem segment and acquires a new one,
doesn't it?  Or am I misremembering?  I thought that it did do so,
if only to make darn sure that no old processes remain connected
to shmem.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-16 Thread Kenneth Marshall
On Wed, Aug 16, 2017 at 05:58:41PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > Attached is a quick sketch of how this could perhaps be done (ignoring
> > for the moment the relatively-boring opclass pushups).  It introduces
> > a new function hash_any_extended which differs from hash_any() in that
> > (a) it combines both b and c into the result and (b) it accepts a seed
> > which is mixed into the initial state if it's non-zero.
> 
> > Comments?
> 
> Hm.  Despite the comment at lines 302-304, I'm not sure that we ought
> to do this simply by using "b" as the high order bits.  AFAICS that
> exposes little or no additional randomness; in particular it seems
> unlikely to meet Jenkins' original design goal that "every 1-bit and
> 2-bit delta achieves avalanche".  There might be some simple way to
> extend the existing code to produce a mostly-independent set of 32 more
> bits, but I wonder if we wouldn't be better advised to just keep Jenkins'
> code as-is and use some other method entirely for producing the
> 32 new result bits.
> 
> ... In fact, on perusing the linked-to page
> http://burtleburtle.net/bob/hash/doobs.html
> Bob says specifically that taking b and c from this hash does not
> produce a fully random 64-bit result.  He has a new one that does,
> lookup3.c, but probably he hasn't tried to make that bit-compatible
> with the 1997 algorithm.
> 

Hi,

The updated hash functions that we currently use are based on Bob Jenkins
lookup3.c and using b as the higher order bits is pretty darn good. I had
lobbied to present the 64-bit b+c hash in the original work for similar
reasons. We are definitely not using a lookup2.c version from 1997.

Regards,
Ken


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund


On August 16, 2017 3:09:27 PM PDT, Tom Lane  wrote:
>Heikki Linnakangas  writes:
>> On 08/17/2017 12:20 AM, Tom Lane wrote:
>>> Indeed, gaur fails with
>>> 2017-08-16 17:09:38.315 EDT [13043:11] PANIC:  stuck spinlock
>detected at pg_atomic_compare_exchange_u64_impl, atomics.c:196
>
>> I was able to reproduce this locally, with --disable-atomics, but
>only 
>> after hacking it to fill the struct with garbage, before initializing
>
>> it. IOW, it failed to fail, because the spinlock happened to be 
>> initialized correctly by accident. Perhaps that's happening on
>piculet, too.
>
>Oh, right.  HPPA is unique among our platforms, I think, in that the
>"unlocked" state of a spinlock is not "all zeroes".  So if you're
>dealing
>with pre-zeroed memory, which shmem generally would be, failing to
>initialize a spinlock does not cause visible errors except on HPPA.
>
>I wonder whether it's sensible to have --enable-cassert have the effect
>of filling memory allocated by ShmemAlloc or the DSA code with junk (as
>palloc does) instead of leaving it at zeroes.  It's not modeling the
>same kind of effect, since we have no shmem-freeing primitives, but
>it might be useful for this sort of thing.

We kind of do - crash restarts... So yes, that's probably a good idea.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Tom Lane
Alvaro Herrera  writes:
> Andres Freund wrote:
>> I think we should constrain the API to only allow later LSNs than
>> currently in the slot, rather than arbitrary ones. That's why I was
>> thinking of "forward".  I'm not convinced it's a good / safe idea to
>> allow arbitrary values to be set.

> Hmm.  In terms of safety, it is safe to move the LSN backwards, as long
> as the oldest LSN across all slots is not changed -- in other words, the
> actual safe limit is the oldest of all slot LSNs, rather than the
> current position of the slot being manipulated (which is what you're
> saying).

True, but you'd need to take some kind of global lock to verify that,
whereas "move forward only" would only need to examine/change the one
slot.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Alvaro Herrera
Andres Freund wrote:
> On 2017-08-16 17:06:42 -0400, Tom Lane wrote:

> > If I understand what this is meant to do, maybe better
> > pg_move_replication_slot_lsn() or pg_change_replication_slot_lsn() ?
> > The point being that you're adjusting the LSN pointer contained
> > in the slot, which is distinct from the slot itself.
> 
> I think we should constrain the API to only allow later LSNs than
> currently in the slot, rather than arbitrary ones. That's why I was
> thinking of "forward".  I'm not convinced it's a good / safe idea to
> allow arbitrary values to be set.

Hmm.  In terms of safety, it is safe to move the LSN backwards, as long
as the oldest LSN across all slots is not changed -- in other words, the
actual safe limit is the oldest of all slot LSNs, rather than the
current position of the slot being manipulated (which is what you're
saying).

I don't know if this is useful for the use case Magnus described; TBH I
didn't understand that use case.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Tom Lane
Andres Freund  writes:
> On 2017-08-16 17:06:42 -0400, Tom Lane wrote:
>> If I understand what this is meant to do, maybe better
>> pg_move_replication_slot_lsn() or pg_change_replication_slot_lsn() ?
>> The point being that you're adjusting the LSN pointer contained
>> in the slot, which is distinct from the slot itself.

> I think we should constrain the API to only allow later LSNs than
> currently in the slot, rather than arbitrary ones. That's why I was
> thinking of "forward".  I'm not convinced it's a good / safe idea to
> allow arbitrary values to be set.

+1 for constraining it like that, but I don't think that's an argument
against using "move" or "change" as the verb.  I don't like "forward"
because that's not the right word.  The only verb senses of "forward"
in my Mac's dictionary are "send a message on to a further destination"
and "help to advance or promote" (the latter usage is pretty obscure IMO).
Neither one seems applicable here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Heikki Linnakangas  writes:
> On 08/17/2017 12:20 AM, Tom Lane wrote:
>> Indeed, gaur fails with
>> 2017-08-16 17:09:38.315 EDT [13043:11] PANIC:  stuck spinlock detected at 
>> pg_atomic_compare_exchange_u64_impl, atomics.c:196

> I was able to reproduce this locally, with --disable-atomics, but only 
> after hacking it to fill the struct with garbage, before initializing 
> it. IOW, it failed to fail, because the spinlock happened to be 
> initialized correctly by accident. Perhaps that's happening on piculet, too.

Oh, right.  HPPA is unique among our platforms, I think, in that the
"unlocked" state of a spinlock is not "all zeroes".  So if you're dealing
with pre-zeroed memory, which shmem generally would be, failing to
initialize a spinlock does not cause visible errors except on HPPA.

I wonder whether it's sensible to have --enable-cassert have the effect
of filling memory allocated by ShmemAlloc or the DSA code with junk (as
palloc does) instead of leaving it at zeroes.  It's not modeling the
same kind of effect, since we have no shmem-freeing primitives, but
it might be useful for this sort of thing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Robert Haas  writes:
> Attached is a quick sketch of how this could perhaps be done (ignoring
> for the moment the relatively-boring opclass pushups).  It introduces
> a new function hash_any_extended which differs from hash_any() in that
> (a) it combines both b and c into the result and (b) it accepts a seed
> which is mixed into the initial state if it's non-zero.

> Comments?

Hm.  Despite the comment at lines 302-304, I'm not sure that we ought
to do this simply by using "b" as the high order bits.  AFAICS that
exposes little or no additional randomness; in particular it seems
unlikely to meet Jenkins' original design goal that "every 1-bit and
2-bit delta achieves avalanche".  There might be some simple way to
extend the existing code to produce a mostly-independent set of 32 more
bits, but I wonder if we wouldn't be better advised to just keep Jenkins'
code as-is and use some other method entirely for producing the
32 new result bits.

... In fact, on perusing the linked-to page
http://burtleburtle.net/bob/hash/doobs.html
Bob says specifically that taking b and c from this hash does not
produce a fully random 64-bit result.  He has a new one that does,
lookup3.c, but probably he hasn't tried to make that bit-compatible
with the 1997 algorithm.

Your injection of the seed as prepended data seems unassailable from the
randomness standpoint.  But I wonder whether we could do it more cheaply
by xoring the seed into the initial a/b/c values --- it's not very clear
whether those are magic in any interesting way, or just some randomly
chosen constants.

Anyway, I'd certainly suggest that whoever embarks on this for real
spend some time perusing Jenkins' website.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Andres Freund
On 2017-08-16 17:06:42 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-08-16 12:24:11 -0400, Peter Eisentraut wrote:
> >> On 5/4/17 08:05, Magnus Hagander wrote:
> >>> PFA a patch that adds a new function, pg_move_replication_slot, that
> >>> makes it possible to move the location of a replication slot without
> >>> actually consuming all the WAL on it.
> 
> >> The name keeps confusing me.  I understand "move" to be "rename" or
> >> possibly "move it elsewhere", but not "move around in".  I understand
> >> that there is an analogy with FETCH/MOVE in cursors, but it's still
> >> confusing.
> 
> > pg_forward_replication_slot()?
> 
> If I understand what this is meant to do, maybe better
> pg_move_replication_slot_lsn() or pg_change_replication_slot_lsn() ?
> The point being that you're adjusting the LSN pointer contained
> in the slot, which is distinct from the slot itself.

I think we should constrain the API to only allow later LSNs than
currently in the slot, rather than arbitrary ones. That's why I was
thinking of "forward".  I'm not convinced it's a good / safe idea to
allow arbitrary values to be set.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Heikki Linnakangas

On 08/17/2017 12:20 AM, Tom Lane wrote:

Andres Freund  writes:

On 2017-08-16 16:20:28 +0300, Heikki Linnakangas wrote:
+   pg_atomic_write_u64(>phs_nallocated, 0);



It's not ok to initialize an atomic with pg_atomic_write_u64 - works
well enough for "plain" atomics, but the fallback implementation isn't
ok with it. You're probably going to get a failure on the respective
buildfarm animal soon.


Indeed, gaur fails with

2017-08-16 17:09:38.315 EDT [13043:11] PANIC:  stuck spinlock detected at pg_at\
omic_compare_exchange_u64_impl, atomics.c:196
2017-08-16 17:09:38.315 EDT [13043:12] STATEMENT:  select count(*) from a_star;
2017-08-16 17:09:40.350 EDT [12437:3] LOG:  server process (PID 13043) was term\
inated by signal 6
2017-08-16 17:09:40.350 EDT [12437:4] DETAIL:  Failed process was running: sele\
ct count(*) from a_star;

and I'm sure pademelon will fail once it gets to that.


Fixed.


I thought we had other buildfarm animals testing the fallback path,
though?

Yes, at least piculet is building with --disable-atomics.

I was able to reproduce this locally, with --disable-atomics, but only 
after hacking it to fill the struct with garbage, before initializing 
it. IOW, it failed to fail, because the spinlock happened to be 
initialized correctly by accident. Perhaps that's happening on piculet, too.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 12:38 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> After some further thought, I propose the following approach to the
>> issues raised on this thread:
>
>> 1. Allow hash functions to have a second, optional support function,
>> similar to what we did for btree opclasses in
>> c6e3ac11b60ac4a8942ab964252d51c1c0bd8845.  The second function will
>> have a signature of (opclass_datatype, int64) and should return int64.
>> The int64 argument is a salt.  When the salt is 0, the low 32 bits of
>> the return value should match what the existing hash support function
>> returns.  Otherwise, the salt should be used to perturb the hash
>> calculation.
>
> +1

Attached is a quick sketch of how this could perhaps be done (ignoring
for the moment the relatively-boring opclass pushups).  It introduces
a new function hash_any_extended which differs from hash_any() in that
(a) it combines both b and c into the result and (b) it accepts a seed
which is mixed into the initial state if it's non-zero.

Comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


hash-any-extended-v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] taking stdbool.h into use

2017-08-16 Thread Tom Lane
I wrote:
> gaur/pademelon isn't booted up right now, but it might provide
> an example of a system that lacks  altogether.
> (If it doesn't, I'd be willing to concede that we need not
> consider that scenario anymore.)

For the record --- pademelon (vendor cc on that box) doesn't have
 at all.  gaur (user-installed gcc) has such a header,
but it contains

typedef enum
  {
false = 0,
true = 1
  } bool;

which unsurprisingly results in
sizeof(bool) = 4

What's possibly more relevant to Peter's patch, this represents
a platform on which "#include " succeeds, but
(a) there is no typedef _Bool, and (b) "bool" is not a macro.
Obviously pre-C99 ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Andres Freund  writes:
> On 2017-08-16 16:20:28 +0300, Heikki Linnakangas wrote:
> + pg_atomic_write_u64(>phs_nallocated, 0);

> It's not ok to initialize an atomic with pg_atomic_write_u64 - works
> well enough for "plain" atomics, but the fallback implementation isn't
> ok with it. You're probably going to get a failure on the respective
> buildfarm animal soon.

Indeed, gaur fails with

2017-08-16 17:09:38.315 EDT [13043:11] PANIC:  stuck spinlock detected at pg_at\
omic_compare_exchange_u64_impl, atomics.c:196
2017-08-16 17:09:38.315 EDT [13043:12] STATEMENT:  select count(*) from a_star;
2017-08-16 17:09:40.350 EDT [12437:3] LOG:  server process (PID 13043) was term\
inated by signal 6
2017-08-16 17:09:40.350 EDT [12437:4] DETAIL:  Failed process was running: sele\
ct count(*) from a_star;

and I'm sure pademelon will fail once it gets to that.  I thought we
had other buildfarm animals testing the fallback path, though?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Tom Lane
Andres Freund  writes:
> On 2017-08-16 12:24:11 -0400, Peter Eisentraut wrote:
>> On 5/4/17 08:05, Magnus Hagander wrote:
>>> PFA a patch that adds a new function, pg_move_replication_slot, that
>>> makes it possible to move the location of a replication slot without
>>> actually consuming all the WAL on it.

>> The name keeps confusing me.  I understand "move" to be "rename" or
>> possibly "move it elsewhere", but not "move around in".  I understand
>> that there is an analogy with FETCH/MOVE in cursors, but it's still
>> confusing.

> pg_forward_replication_slot()?

If I understand what this is meant to do, maybe better
pg_move_replication_slot_lsn() or pg_change_replication_slot_lsn() ?
The point being that you're adjusting the LSN pointer contained
in the slot, which is distinct from the slot itself.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extra Vietnamese unaccent rules

2017-08-16 Thread Tom Lane
Dang Minh Huong  writes:
> On 2017/07/05 15:28, Michael Paquier wrote:
>> (Surprised to see that generate_unaccent_rules.py is inconsistent on
>> MacOS, runs fine on Linux).

FWIW, I got identical results from running the script on current macOS
(Sierra) and Linux (RHEL6).

>> Testing with characters having two accents, the results are produced
>> as wanted. I am attaching an updated patch with all those
>> simplifications. Thoughts?

> Thanks, so pretty. The patch is fine to me.

Pushed into v11.  I'm not really qualified to review the Python coding
style, but I did fix a typo in a comment.

BTW, while this isn't a reason to delay this patch, I wonder whether
the regression test for unaccent is really adequate.  According to
https://coverage.postgresql.org/contrib/unaccent/unaccent.c.gcov.html
it isn't doing anything to check multicharacter source strings, and
what's considerably more disturbing, it isn't exercising the PG_CATCH
code that's meant to deal with characters outside the current database's
encoding.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to move the position of a replication slot

2017-08-16 Thread Andres Freund
On 2017-08-16 12:24:11 -0400, Peter Eisentraut wrote:
> On 5/4/17 08:05, Magnus Hagander wrote:
> > PFA a patch that adds a new function, pg_move_replication_slot, that
> > makes it possible to move the location of a replication slot without
> > actually consuming all the WAL on it.
> 
> The name keeps confusing me.  I understand "move" to be "rename" or
> possibly "move it elsewhere", but not "move around in".  I understand
> that there is an analogy with FETCH/MOVE in cursors, but it's still
> confusing.

pg_forward_replication_slot()?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Garbled comment in postgresGetForeignJoinPaths

2017-08-16 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 16, 2017 at 3:44 PM, Tom Lane  wrote:
>> Will you fix it, or shall I?

> Whichever you like.

It's your commit, you can do the honors.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Garbled comment in postgresGetForeignJoinPaths

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 3:44 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Aug 16, 2017 at 3:02 PM, Robert Haas  wrote:
>>> On Wed, Aug 16, 2017 at 2:16 PM, Tom Lane  wrote:
 The current text of the comment dates to commit 177c56d60, and looking at
 that commit makes it pretty clear that the line I'm complaining of
 belonged to the previous text; it evidently just missed getting deleted.
>
>>> Got it.  Nice forensics, and sorry about the good.
>
>> ... goof.
>
> Will you fix it, or shall I?

Whichever you like.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Garbled comment in postgresGetForeignJoinPaths

2017-08-16 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 16, 2017 at 3:02 PM, Robert Haas  wrote:
>> On Wed, Aug 16, 2017 at 2:16 PM, Tom Lane  wrote:
>>> The current text of the comment dates to commit 177c56d60, and looking at
>>> that commit makes it pretty clear that the line I'm complaining of
>>> belonged to the previous text; it evidently just missed getting deleted.

>> Got it.  Nice forensics, and sorry about the good.

> ... goof.

Will you fix it, or shall I?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Heikki Linnakangas  writes:
> On 08/16/2017 09:00 PM, Tom Lane wrote:
>> I don't buy that argument.  A caller might think "Why do I need
>> shm_toc_estimate, when I can compute the *exact* size I need?".
>> And it would have worked, up till this proposed patch.

> Well, no. The size of the shm_toc struct is subtracted from the size 
> that you give to shm_toc_create. As well as the sizes of the TOC 
> entries. And those sizes are private to shm_toc.c, so a caller has no 
> way to know what size it should pass to shm_toc_create(), in order to 
> have N bytes of space actually usable. You really need to use 
> shm_toc_estimate() if you want any guarantees on what will fit.

Good point --- objection withdrawn.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] distinct estimate of a hard-coded VALUES list

2017-08-16 Thread Tom Lane
Jeff Janes  writes:
> This patch still applies, and I think the argument for it is still valid.
> So I'm going to make a commit-fest entry for it.  Is there additional
> evidence we should gather?

I think we had consensus to apply this at the start of the next
development cycle; I just forgot to do it for v10.  Hence, pushed
into v11.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
On 2017-08-16 16:20:28 +0300, Heikki Linnakangas wrote:
> On 05/06/2017 04:57 PM, David Rowley wrote:
> > Andres mentioned in [2] that it might be worth exploring using atomics
> > to do the same job. So I went ahead and did that, and came up with the
> > attached, which is a slight variation on what he mentioned in the
> > thread.
> > 
> > To keep things a bit more simple, and streamline, I ended up pulling
> > out the logic for setting the startblock into another function, which
> > we only call once before the first call to
> > heap_parallelscan_nextpage().  I also ended up changing phs_cblock and
> > replacing it with a counter that always starts at zero. The actual
> > block is calculated based on that + the startblock modulo nblocks.
> > This makes things a good bit more simple for detecting when we've
> > allocated all the blocks to the workers, and also works nicely when
> > wrapping back to the start of a relation when we started somewhere in
> > the middle due to piggybacking with a synchronous scan.

> Looks reasonable. I edited the comments and the variable names a bit, to my
> liking, and committed. Thanks!

Brief post-commit review:

+* phs_nallocated tracks how many pages have been allocated to workers
+* already.  When phs_nallocated >= rs_nblocks, all blocks have been
+* allocated.

allocated seems a bit of a confusing terminology.


@@ -1635,8 +1637,8 @@ heap_parallelscan_initialize(ParallelHeapScanDesc target, 
Relation relation,
!RelationUsesLocalBuffers(relation) &&
target->phs_nblocks > NBuffers / 4;
SpinLockInit(>phs_mutex);
-   target->phs_cblock = InvalidBlockNumber;
target->phs_startblock = InvalidBlockNumber;
+   pg_atomic_write_u64(>phs_nallocated, 0);
SerializeSnapshot(snapshot, target->phs_snapshot_data);
 }

It's not ok to initialize an atomic with pg_atomic_write_u64 - works
well enough for "plain" atomics, but the fallback implementation isn't
ok with it. You're probably going to get a failure on the respective
buildfarm animal soon.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
On 2017-08-16 14:09:08 -0400, Tom Lane wrote:
> >> I'm not sure that that's good enough, and I'm damn sure that it
> >> shouldn't be undocumented.
> 
> > 8 byte alignment would be good enough, so BUFFERALIGN ought to be
> > sufficient. But it'd be nicer to have a separate more descriptive knob.
> 
> What I meant by possibly not good enough is that pg_atomic_uint64 used
> in other places isn't going to be very safe.

Well, it's not used otherwise in core so far, leaving test code
aside. It's correctly aligned if part of a aligned struct - the atomics
code itself can't really do anything about aligning that struct itself
isn't aligned.


> We might be effectively all right as long as we have a coding rule that
> pg_atomic_uint64 can only be placed in memory handed out by ShmemAlloc
> or shm_toc_allocate, which both have bigger-than-MAXALIGN alignment
> practices.  But this needs to be documented.

Well, one could argue the alignment checks in every function are that
:). But yea, we probably should mention it more than that.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Garbled comment in postgresGetForeignJoinPaths

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 3:02 PM, Robert Haas  wrote:
> On Wed, Aug 16, 2017 at 2:16 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> On Wed, Aug 16, 2017 at 12:31 PM, Tom Lane  wrote:
 -->  * reconstruct the row for EvalPlanQual(). Find an alternative local 
 path
 Should the marked line simply be deleted?  If not, what correction is
 appropriate?
>>
>>> Hmm, wow.  My first thought was that it should just say
>>> "reconstructing" rather than "reconstruct", but on further reading I
>>> think you might have the right idea.
>>
>> The current text of the comment dates to commit 177c56d60, and looking at
>> that commit makes it pretty clear that the line I'm complaining of
>> belonged to the previous text; it evidently just missed getting deleted.
>
> Got it.  Nice forensics, and sorry about the good.

... goof.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Garbled comment in postgresGetForeignJoinPaths

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 2:16 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Aug 16, 2017 at 12:31 PM, Tom Lane  wrote:
>>> -->  * reconstruct the row for EvalPlanQual(). Find an alternative local 
>>> path
>>> Should the marked line simply be deleted?  If not, what correction is
>>> appropriate?
>
>> Hmm, wow.  My first thought was that it should just say
>> "reconstructing" rather than "reconstruct", but on further reading I
>> think you might have the right idea.
>
> The current text of the comment dates to commit 177c56d60, and looking at
> that commit makes it pretty clear that the line I'm complaining of
> belonged to the previous text; it evidently just missed getting deleted.

Got it.  Nice forensics, and sorry about the good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Heikki Linnakangas

On 08/16/2017 09:00 PM, Tom Lane wrote:

Robert Haas  writes:

On Wed, Aug 16, 2017 at 1:44 PM, Tom Lane  wrote:

I was feeling a bit uncomfortable with the BUFFERALIGN_DOWN() for a
different reason: if the caller has specified the exact amount of space it
needs, having shm_toc_create discard some could lead to an unexpected
failure.



Well, that's why Heikki also patched shm_toc_estimate.  With the
patch, if the size being used in shm_toc_create comes from
shm_toc_estimate, it will always be aligned and nothing bad will
happen.


Sure.  So the point is entirely about what should happen if someone
doesn't use shm_toc_estimate.


If the user invents another size out of whole cloth, then
they might get a few bytes less than they expect, but that's what you
get for not using shm_toc_estimate().


I don't buy that argument.  A caller might think "Why do I need
shm_toc_estimate, when I can compute the *exact* size I need?".
And it would have worked, up till this proposed patch.


Well, no. The size of the shm_toc struct is subtracted from the size 
that you give to shm_toc_create. As well as the sizes of the TOC 
entries. And those sizes are private to shm_toc.c, so a caller has no 
way to know what size it should pass to shm_toc_create(), in order to 
have N bytes of space actually usable. You really need to use 
shm_toc_estimate() if you want any guarantees on what will fit.


I've pushed the patch to fix this, with some extra comments and 
reformatting shm_toc_estimate.



8 byte alignment would be good enough, so BUFFERALIGN ought to be
sufficient. But it'd be nicer to have a separate more descriptive knob.


What I meant by possibly not good enough is that pg_atomic_uint64 used
in other places isn't going to be very safe.

We might be effectively all right as long as we have a coding rule that
pg_atomic_uint64 can only be placed in memory handed out by ShmemAlloc
or shm_toc_allocate, which both have bigger-than-MAXALIGN alignment
practices.  But this needs to be documented.


Yeah. We are implicitly also relying on ShmemAlloc() to return 
sufficiently-aligned memory. Palloc() too, although you probably 
wouldn't use atomic ops on a palloc'd struct. I think we should 
introduce a new ALIGNOF macro for that, and document that those 
functions return memory with enough alignment. GENUINEMAX_ALIGNOF? 
MAXSTRUCT_ALIGNOF?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Broken link to DocBook XSL Stylesheets

2017-08-16 Thread Peter Eisentraut
On 6/20/17 22:20, Masahiko Sawada wrote:
> Hi,
> 
> In J.2. Tool Sets section of documentation, there is a link to DocBook
> XSL Stylesheets but that link seems no longer available. I got 404
> error.
> 
> J.2. Tool Sets
> 
> 
> As a updated next link I guess we can use the following link.
> 

Fixed.

(I only backpatched this to PG10, because with older versions you then
get to the issue that the DSSSL stylesheets don't appear to have any
website at all anymore.  So better leave all of that rot consistently. ;-) )

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication: stuck spinlock at ReplicationSlotRelease

2017-08-16 Thread Peter Eisentraut
On 6/23/17 20:58, Peter Eisentraut wrote:
> On 6/23/17 16:15, Andres Freund wrote:
>> On 2017-06-23 13:26:58 -0400, Alvaro Herrera wrote:
>>> Hmm, so for instance in LogicalIncreaseRestartDecodingForSlot() we have
>>> some elog(DEBUG1) calls with the slot spinlock held.  That's probably
>>> uncool.
>>
>> It definitely is not cool, rather daft even (it's probably me who wrote
>> that).  No idea why I've done it like that, makes no sense.
> 
> Do you want to take a look at move those elog calls around a bit?  That
> should do it.

This hasn't been fixed yet.  How should we do it?  Just move the
spinlock release before the elog() calls, thus risking reading some data
unprotected, or copy the necessary things into local variables?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Garbled comment in postgresGetForeignJoinPaths

2017-08-16 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 16, 2017 at 12:31 PM, Tom Lane  wrote:
>> -->  * reconstruct the row for EvalPlanQual(). Find an alternative local path
>> Should the marked line simply be deleted?  If not, what correction is
>> appropriate?

> Hmm, wow.  My first thought was that it should just say
> "reconstructing" rather than "reconstruct", but on further reading I
> think you might have the right idea.

The current text of the comment dates to commit 177c56d60, and looking at
that commit makes it pretty clear that the line I'm complaining of
belonged to the previous text; it evidently just missed getting deleted.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Andres Freund  writes:
> On 2017-08-16 13:40:09 -0400, Tom Lane wrote:
>> I was wondering why the shm_toc code was using BUFFERALIGN and not
>> MAXALIGN, and I now suspect that the answer is "it's an entirely
>> undocumented kluge to make the atomics code not crash on 32-bit
>> machines, so long as nobody puts a pg_atomic_uint64 anywhere except in
>> a shm_toc".

> I don't think there were any atomics in affected code until earlier
> today... And given it didn't work for shm_toc anyway, I'm not quite
> following.

Right, Robert pointed out that it's pre-existing code.  My point should
be read as "it's just blind luck that shm_toc is using bigger than
MAXALIGN alignment, or this would never work on 32-bit machines".

>> I'm not sure that that's good enough, and I'm damn sure that it
>> shouldn't be undocumented.

> 8 byte alignment would be good enough, so BUFFERALIGN ought to be
> sufficient. But it'd be nicer to have a separate more descriptive knob.

What I meant by possibly not good enough is that pg_atomic_uint64 used
in other places isn't going to be very safe.

We might be effectively all right as long as we have a coding rule that
pg_atomic_uint64 can only be placed in memory handed out by ShmemAlloc
or shm_toc_allocate, which both have bigger-than-MAXALIGN alignment
practices.  But this needs to be documented.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 10 beta docs: different replication solutions

2017-08-16 Thread Peter Eisentraut
On 7/30/17 21:34, Steve Singer wrote:
> We don't seem to describe logical replication on
> 
> https://www.postgresql.org/docs/10/static/different-replication-solutions.html
> 
> The attached patch adds a section.

Committed with some further tweaking, thanks!

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 16, 2017 at 1:44 PM, Tom Lane  wrote:
>> I was feeling a bit uncomfortable with the BUFFERALIGN_DOWN() for a
>> different reason: if the caller has specified the exact amount of space it
>> needs, having shm_toc_create discard some could lead to an unexpected
>> failure.

> Well, that's why Heikki also patched shm_toc_estimate.  With the
> patch, if the size being used in shm_toc_create comes from
> shm_toc_estimate, it will always be aligned and nothing bad will
> happen.

Sure.  So the point is entirely about what should happen if someone
doesn't use shm_toc_estimate.

> If the user invents another size out of whole cloth, then
> they might get a few bytes less than they expect, but that's what you
> get for not using shm_toc_estimate().

I don't buy that argument.  A caller might think "Why do I need
shm_toc_estimate, when I can compute the *exact* size I need?".
And it would have worked, up till this proposed patch.

I think the new API spec for such cases ought to be "if you supply an
unaligned size, we'll error out", not "if you supply an unaligned size,
we'll silently lose some of it".  The former is going to behave a lot
more predictably.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund


On August 16, 2017 10:47:23 AM PDT, Robert Haas  wrote:
>On Wed, Aug 16, 2017 at 1:40 PM, Tom Lane  wrote:
>> I was wondering why the shm_toc code was using BUFFERALIGN and not
>> MAXALIGN, and I now suspect that the answer is "it's an entirely
>> undocumented kluge to make the atomics code not crash on 32-bit
>> machines, so long as nobody puts a pg_atomic_uint64 anywhere except
>> in a shm_toc".
>
>Well, shm_toc considerably predates 64-bit atomics, so I think the
>causality cannot run in that direction.  shm_toc.c first appeared in
>the tree in January of 2014.  src/include/port/atomics didn't show up
>until September of that year, and 64-bit atomics weren't actually
>usable in practice until e8fdbd58fe564a29977f4331cd26f9697d76fc40 in
>April of 2017.

Well, not for core code.  I certainly know about production code using it, 
because crusty platforms are considered irrelevant...

Independent of that, a comment explaining what the BUFFERALIGN is intending 
would be good.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 1:44 PM, Tom Lane  wrote:
> I was feeling a bit uncomfortable with the BUFFERALIGN_DOWN() for a
> different reason: if the caller has specified the exact amount of space it
> needs, having shm_toc_create discard some could lead to an unexpected
> failure.

Well, that's why Heikki also patched shm_toc_estimate.  With the
patch, if the size being used in shm_toc_create comes from
shm_toc_estimate, it will always be aligned and nothing bad will
happen.  If the user invents another size out of whole cloth, then
they might get a few bytes less than they expect, but that's what you
get for not using shm_toc_estimate().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
On 2017-08-16 13:44:28 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Don't think we require BUFFERALIGN - MAXALIGN ought to be
> > sufficient.
> 
> Uh, see my other message just now.

Yup, you're right.


> > The use of BUFFERALIGN presumably is to space out things
> > into different cachelines, but that doesn't really seem to be important
> > with this.  Then we can just avoid defining the new macro...
> 
> I was feeling a bit uncomfortable with the BUFFERALIGN_DOWN() for a
> different reason: if the caller has specified the exact amount of space it
> needs, having shm_toc_create discard some could lead to an unexpected
> failure.

Well, that's why shm_toc_estimate() increases the size appropriately.


> I wonder whether maybe shm_toc_create should just error out if the
> number it's handed isn't aligned already.

I think that's going to be harder atm, because it's not the size shm_toc
computes, it's what the caller to shm_toc_estimate_chunk() provides.
And that size is already guaranteed to be upsized by BUFFERALIGN in
shm_toc_estimate_chunk().  It's just that the base-offset from where the
allocations start isn't aligned.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 1:40 PM, Tom Lane  wrote:
> I was wondering why the shm_toc code was using BUFFERALIGN and not
> MAXALIGN, and I now suspect that the answer is "it's an entirely
> undocumented kluge to make the atomics code not crash on 32-bit
> machines, so long as nobody puts a pg_atomic_uint64 anywhere except
> in a shm_toc".

Well, shm_toc considerably predates 64-bit atomics, so I think the
causality cannot run in that direction.  shm_toc.c first appeared in
the tree in January of 2014.  src/include/port/atomics didn't show up
until September of that year, and 64-bit atomics weren't actually
usable in practice until e8fdbd58fe564a29977f4331cd26f9697d76fc40 in
April of 2017.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
On 2017-08-16 13:40:09 -0400, Tom Lane wrote:
> I wrote:
> > I can confirm that on dromedary, that regression test case is attempting
> > to create a TOC with a not-well-aligned size: 93268 = 0x16c54 bytes.
> 
> ... although, on closer look, it still seems like we have a fundamental
> bit of schizophrenia here, because on this machine
> 
> $ grep ALIGN pg_config.h
> #define ALIGNOF_DOUBLE 4
> #define ALIGNOF_INT 4
> #define ALIGNOF_LONG 4
> #define ALIGNOF_LONG_LONG_INT 4
> #define ALIGNOF_SHORT 2
> #define MAXIMUM_ALIGNOF 4
> 
> Basically, therefore, ISTM that it is not a good thing that the atomics
> code thinks it can rely on 8-byte-aligned data when the entire rest of
> the system believes that 4-byte alignment is enough for anything.

That's a hardware requirement, we can't do much about it. Several
[micro-]architectures don't support unaligned atomic 8 byte writes.


> I was wondering why the shm_toc code was using BUFFERALIGN and not
> MAXALIGN, and I now suspect that the answer is "it's an entirely
> undocumented kluge to make the atomics code not crash on 32-bit
> machines, so long as nobody puts a pg_atomic_uint64 anywhere except in
> a shm_toc".

I don't think there were any atomics in affected code until earlier
today... And given it didn't work for shm_toc anyway, I'm not quite
following.


> I'm not sure that that's good enough, and I'm damn sure that it
> shouldn't be undocumented.

8 byte alignment would be good enough, so BUFFERALIGN ought to be
sufficient. But it'd be nicer to have a separate more descriptive knob.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Andres Freund  writes:
> Don't think we require BUFFERALIGN - MAXALIGN ought to be
> sufficient.

Uh, see my other message just now.

> The use of BUFFERALIGN presumably is to space out things
> into different cachelines, but that doesn't really seem to be important
> with this.  Then we can just avoid defining the new macro...

I was feeling a bit uncomfortable with the BUFFERALIGN_DOWN() for a
different reason: if the caller has specified the exact amount of space it
needs, having shm_toc_create discard some could lead to an unexpected
failure.  I wonder whether maybe shm_toc_create should just error out if
the number it's handed isn't aligned already.

>> +return BUFFERALIGN(
>> +add_size(offsetof(shm_toc, toc_entry),
>> + add_size(mul_size(e->number_of_keys, 
>> sizeof(shm_toc_entry)),
>> +  e->space_for_chunks)));

> I think splitting this into separate statements would be better.

+1, it was too complicated already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
I wrote:
> I can confirm that on dromedary, that regression test case is attempting
> to create a TOC with a not-well-aligned size: 93268 = 0x16c54 bytes.

... although, on closer look, it still seems like we have a fundamental
bit of schizophrenia here, because on this machine

$ grep ALIGN pg_config.h
#define ALIGNOF_DOUBLE 4
#define ALIGNOF_INT 4
#define ALIGNOF_LONG 4
#define ALIGNOF_LONG_LONG_INT 4
#define ALIGNOF_SHORT 2
#define MAXIMUM_ALIGNOF 4

Basically, therefore, ISTM that it is not a good thing that the atomics
code thinks it can rely on 8-byte-aligned data when the entire rest of
the system believes that 4-byte alignment is enough for anything.

I was wondering why the shm_toc code was using BUFFERALIGN and not
MAXALIGN, and I now suspect that the answer is "it's an entirely
undocumented kluge to make the atomics code not crash on 32-bit
machines, so long as nobody puts a pg_atomic_uint64 anywhere except
in a shm_toc".

I'm not sure that that's good enough, and I'm damn sure that it
shouldn't be undocumented.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Tom Lane
Heikki Linnakangas  writes:
> On 08/16/2017 08:10 PM, Andres Freund wrote:
>> Seems like we'd just have to add alignment of the total size to
>> shm_toc_estimate()?

> Yeah, that's the gist of it.

I can confirm that on dromedary, that regression test case is attempting
to create a TOC with a not-well-aligned size: 93268 = 0x16c54 bytes.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
Hi,

On 2017-08-16 20:19:29 +0300, Heikki Linnakangas wrote:
> diff --git a/src/backend/storage/ipc/shm_toc.c 
> b/src/backend/storage/ipc/shm_toc.c
> index 9f259441f0..121d5a1ec9 100644
> --- a/src/backend/storage/ipc/shm_toc.c
> +++ b/src/backend/storage/ipc/shm_toc.c
> @@ -44,7 +44,7 @@ shm_toc_create(uint64 magic, void *address, Size nbytes)
>   Assert(nbytes > offsetof(shm_toc, toc_entry));
>   toc->toc_magic = magic;
>   SpinLockInit(>toc_mutex);
> - toc->toc_total_bytes = nbytes;
> + toc->toc_total_bytes = BUFFERALIGN_DOWN(nbytes);
>   toc->toc_allocated_bytes = 0;
>   toc->toc_nentry = 0;

Don't think we require BUFFERALIGN - MAXALIGN ought to be
sufficient. The use of BUFFERALIGN presumably is to space out things
into different cachelines, but that doesn't really seem to be important
with this.  Then we can just avoid defining the new macro...


> @@ -252,7 +252,8 @@ shm_toc_lookup(shm_toc *toc, uint64 key, bool noError)
>  Size
>  shm_toc_estimate(shm_toc_estimator *e)
>  {
> - return add_size(offsetof(shm_toc, toc_entry),
> - add_size(mul_size(e->number_of_keys, 
> sizeof(shm_toc_entry)),
> -  e->space_for_chunks));
> + return BUFFERALIGN(
> + add_size(offsetof(shm_toc, toc_entry),
> +  add_size(mul_size(e->number_of_keys, 
> sizeof(shm_toc_entry)),
> +   e->space_for_chunks)));
>  }

I think splitting this into separate statements would be better. I think
we also should rather *ALLIGN the size without e->space_for_chunks. The
latter is already aligned properly, and the important bit is that we
have enough space for e->space_for_chunks afterwards, and only padding
is in th eway of that...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 1:19 PM, Heikki Linnakangas  wrote:
> Yeah, that's the gist of it.
>
> The attached patch seems to fix this. I'm not too familiar with this DSM
> stuff, but this seems right to me. Unless someone has a better idea soon,
> I'll commit this to make the buildfarm happy.

Mmm, clever.  Yeah, that looks reasonable at first glance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Garbled comment in postgresGetForeignJoinPaths

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 12:31 PM, Tom Lane  wrote:
> postgres_fdw.c around line 4500:
>
> /*
>  * If there is a possibility that EvalPlanQual will be executed, we need
>  * to be able to reconstruct the row using scans of the base relations.
>  * GetExistingLocalJoinPath will find a suitable path for this purpose in
>  * the path list of the joinrel, if one exists.  We must be careful to
>  * call it before adding any ForeignPath, since the ForeignPath might
>  * dominate the only suitable local path available.  We also do it before
> -->  * reconstruct the row for EvalPlanQual(). Find an alternative local path
>  * calling foreign_join_ok(), since that function updates fpinfo and marks
>  * it as pushable if the join is found to be pushable.
>  */
>
> Should the marked line simply be deleted?  If not, what correction is
> appropriate?

Hmm, wow.  My first thought was that it should just say
"reconstructing" rather than "reconstruct", but on further reading I
think you might have the right idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Heikki Linnakangas

On 08/16/2017 08:10 PM, Andres Freund wrote:

Afaict shm_create/shm_toc_allocate don't actually guarantee that the end
of the toc's memory is suitably aligned.  But I didn't yet have any
coffee, so ...


Robert, I'm not quite sure what the intended behaviour of shm_toc is wrt
alignment. I see that individual chunks are BUFFERALIGNed (both during
estimation, and allocation). But I don't see how the size of the entire
toc is aligned, which seems a requirement, given we allocate from the
end.
Seems like we'd just have to add alignment of the total size to
shm_toc_estimate()?


Yeah, that's the gist of it.

The attached patch seems to fix this. I'm not too familiar with this DSM 
stuff, but this seems right to me. Unless someone has a better idea 
soon, I'll commit this to make the buildfarm happy.


- Heikki
diff --git a/src/backend/storage/ipc/shm_toc.c b/src/backend/storage/ipc/shm_toc.c
index 9f259441f0..121d5a1ec9 100644
--- a/src/backend/storage/ipc/shm_toc.c
+++ b/src/backend/storage/ipc/shm_toc.c
@@ -44,7 +44,7 @@ shm_toc_create(uint64 magic, void *address, Size nbytes)
 	Assert(nbytes > offsetof(shm_toc, toc_entry));
 	toc->toc_magic = magic;
 	SpinLockInit(>toc_mutex);
-	toc->toc_total_bytes = nbytes;
+	toc->toc_total_bytes = BUFFERALIGN_DOWN(nbytes);
 	toc->toc_allocated_bytes = 0;
 	toc->toc_nentry = 0;
 
@@ -252,7 +252,8 @@ shm_toc_lookup(shm_toc *toc, uint64 key, bool noError)
 Size
 shm_toc_estimate(shm_toc_estimator *e)
 {
-	return add_size(offsetof(shm_toc, toc_entry),
-	add_size(mul_size(e->number_of_keys, sizeof(shm_toc_entry)),
-			 e->space_for_chunks));
+	return BUFFERALIGN(
+		add_size(offsetof(shm_toc, toc_entry),
+ add_size(mul_size(e->number_of_keys, sizeof(shm_toc_entry)),
+		  e->space_for_chunks)));
 }
diff --git a/src/include/c.h b/src/include/c.h
index 9066e3c578..af799dc1df 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -598,6 +598,7 @@ typedef NameData *Name;
 #define LONGALIGN_DOWN(LEN)		TYPEALIGN_DOWN(ALIGNOF_LONG, (LEN))
 #define DOUBLEALIGN_DOWN(LEN)	TYPEALIGN_DOWN(ALIGNOF_DOUBLE, (LEN))
 #define MAXALIGN_DOWN(LEN)		TYPEALIGN_DOWN(MAXIMUM_ALIGNOF, (LEN))
+#define BUFFERALIGN_DOWN(LEN)	TYPEALIGN_DOWN(ALIGNOF_BUFFER, (LEN))
 
 /*
  * The above macros will not work with types wider than uintptr_t, like with

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 12:38 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> After some further thought, I propose the following approach to the
>> issues raised on this thread:
>
>> 1. Allow hash functions to have a second, optional support function,
>> similar to what we did for btree opclasses in
>> c6e3ac11b60ac4a8942ab964252d51c1c0bd8845.  The second function will
>> have a signature of (opclass_datatype, int64) and should return int64.
>> The int64 argument is a salt.  When the salt is 0, the low 32 bits of
>> the return value should match what the existing hash support function
>> returns.  Otherwise, the salt should be used to perturb the hash
>> calculation.
>
> +1

Cool.

>> 2. Introduce a new hash opfamilies here which are more faster, more
>> portable, and/or better in other ways than the ones we have today.
>
> This part seems, uh, under-defined and/or over-ambitious and/or unrelated
> to the problem at hand.  What are the concrete goals?

In my view, it's for the person who proposed a new opclass to say what
goals they're trying to satisfy with that opclass.  A variety of goals
that could justify a new opclass have been proposed upthread --
especially in Jeff Davis's original post (q.v.).  Such goals could
include endian-ness independence, collation-independence, speed,
better bit-mixing, and opfamilies that span more types than our
currently ones.  These goals are probably mutually exclusive, in the
sense that endian-ness independence and collation-independence are
probably pulling in the exact opposite direction as speed, so
conceivably there could be multiple opclasses proposed with different
trade-offs.  I take no position for the present on which of those
would be worth accepting into core.

I agree with you that all of this is basically unrelated to the
problem at hand, if "the problem at hand" means "hash partitioning".
In my mind, there are two really serious issues that have been raised
on that front.  One is the problem of hash joins/aggregates/indexes on
hash-partitioned tables coming out lopsided, and adding an optional
salt will let us fix that problem.  The other is that if you migrate
your data to a different encoding or endianness, you might have
dump/reload problems, but IMHO the already-committed patch for
--load-via-partition-root is as much as really *needs* to be done
there.  I am less skeptical about the idea of endianness-independent
hash functions than he is, but "we can't have
$FEATURE_MANY_PEOPLE_WANT until we solve
$PROBLEM_ANDRES_THINKS_IS_NOT_PRACTICALLY_SOLVABLE" is not a route to
swift victory.

In short, I'm proposing to add a method to seed the existing hash
functions and get 64 bits out of them, and leaving any other potential
improvements to someone who wants to argue for them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Orphaned files in base/[oid]

2017-08-16 Thread Andres Freund
On 2017-08-16 14:20:02 +0200, Chris Travers wrote:
> So having throught about this a bit more, and having had some real-world
> experience with the script now, I have an idea that might work and some
> questions to make it succeed.
> 
> My thinking is to add a new form of vacuum called VACUUM FSCK.
> 
> This would:
> 1. lock pg_class in exclusive mode (or do I need exclusive access?), as
> this is needed to solve the race conditions.  As I see, this seems to bring
> the database to a screeching halt concurrency-wise (but unlike my script
> would allow other databases to be accessed normally).
> 2. read the files where the name consists of only digits out of the
> filesystem and compare with oids from pg_class and relfilenodes
> 3.  Any file not found in that list would then unlink it, as well as any
> files with the patter followed by an underscore or period.
> 
> This would mean that the following cases would not be handled:
> 
> If you have the first extent gone but later extents are present we check on
> the first extant, and so would not see the later ones.  Same goes for
> visibility maps and other helper files.
> 
> If you add a file in the directory which has a name like 34F3A222BC, that
> would never get cleaned up because it contains non-digits.
> 
> So this leads to the following questions:
> 
> 1.  Is locking pg_class enough to avoid race conditions?  Is exclusive mode
> sufficient or do I need exclusive access mode?
> 2.  would it be preferable to move the file to a directory rather than
> unlinking it?
> 3.  Should I perform any sort of check on the tables at the end to make
> sure everything is ok?

I think this entirely is the wrong approach.  We shouldn't add weird
check commands that require locks on pg_class, we should avoid leaving
the orphaned files in the first place.  I've upthread outlined
approached how to do so.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
On 2017-08-16 09:57:35 -0700, Peter Geoghegan wrote:
> On Wed, Aug 16, 2017 at 9:55 AM, Andres Freund  wrote:
> > On 2017-08-16 11:16:58 -0400, Tom Lane wrote:
> >> Heikki Linnakangas  writes:
> >> > A couple of 32-bit x86 buildfarm members don't seem to be happy with
> >> > this. I'll investigate, but if anyone has a clue, I'm all ears...
> >>
> >> dromedary's issue seems to be alignment:
> >>
> >> TRAP: UnalignedPointer("(((uintptr_t) ((uintptr_t)(ptr)) + ((8) - 1)) & 
> >> ~((uintptr_t) ((8) - 1))) != (uintptr_t)(ptr)", File: 
> >> "../../../../src/include/port/atomics.h", Line: 452)
> >> 2017-08-16 11:11:38.558 EDT [75693:3] LOG:  server process (PID 76277) was 
> >> terminated by signal 6: Abort trap
> >> 2017-08-16 11:11:38.558 EDT [75693:4] DETAIL:  Failed process was running: 
> >> select count(*) from a_star;
> >>
> >> Not sure if this is your bug or if it's exposing a pre-existing
> >> deficiency in the atomics code, viz, failure to ensure that
> >> pg_atomic_uint64 is actually a 64-bit-aligned type.  Andres?
> >
> > I suspect it's the former.  Suspect that the shared memory that holds
> > the "parallel desc" isn't properly aligned:
> 
> Clang has an -fsanitize=alignment option that may be useful here.

Given that we're crashing in an assert that does nothing but check for
alignment, before the memory is actually used in a "dangerous" manner, I
don't quite see how?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
Robert, Tom,


On 2017-08-16 09:55:15 -0700, Andres Freund wrote:
> > Not sure if this is your bug or if it's exposing a pre-existing
> > deficiency in the atomics code, viz, failure to ensure that
> > pg_atomic_uint64 is actually a 64-bit-aligned type.  Andres?

> I suspect it's the former.  Suspect that the shared memory that holds
> the "parallel desc" isn't properly aligned:

Or, well, a mixture of both, because it seems like a deficiency in the
shm_toc, code, rather than the atomics code.


> Afaict shm_create/shm_toc_allocate don't actually guarantee that the end
> of the toc's memory is suitably aligned.  But I didn't yet have any
> coffee, so ...

Robert, I'm not quite sure what the intended behaviour of shm_toc is wrt
alignment. I see that individual chunks are BUFFERALIGNed (both during
estimation, and allocation). But I don't see how the size of the entire
toc is aligned, which seems a requirement, given we allocate from the
end.
Seems like we'd just have to add alignment of the total size to
shm_toc_estimate()?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Simplify plpgsql's check for simple expressions.

2017-08-16 Thread Tom Lane
Robert Haas  writes:
> You want to push something, or should I do it?

Go for it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Peter Geoghegan
On Wed, Aug 16, 2017 at 9:55 AM, Andres Freund  wrote:
> On 2017-08-16 11:16:58 -0400, Tom Lane wrote:
>> Heikki Linnakangas  writes:
>> > A couple of 32-bit x86 buildfarm members don't seem to be happy with
>> > this. I'll investigate, but if anyone has a clue, I'm all ears...
>>
>> dromedary's issue seems to be alignment:
>>
>> TRAP: UnalignedPointer("(((uintptr_t) ((uintptr_t)(ptr)) + ((8) - 1)) & 
>> ~((uintptr_t) ((8) - 1))) != (uintptr_t)(ptr)", File: 
>> "../../../../src/include/port/atomics.h", Line: 452)
>> 2017-08-16 11:11:38.558 EDT [75693:3] LOG:  server process (PID 76277) was 
>> terminated by signal 6: Abort trap
>> 2017-08-16 11:11:38.558 EDT [75693:4] DETAIL:  Failed process was running: 
>> select count(*) from a_star;
>>
>> Not sure if this is your bug or if it's exposing a pre-existing
>> deficiency in the atomics code, viz, failure to ensure that
>> pg_atomic_uint64 is actually a 64-bit-aligned type.  Andres?
>
> I suspect it's the former.  Suspect that the shared memory that holds
> the "parallel desc" isn't properly aligned:

Clang has an -fsanitize=alignment option that may be useful here.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics for heap_parallelscan_nextpage()

2017-08-16 Thread Andres Freund
On 2017-08-16 11:16:58 -0400, Tom Lane wrote:
> Heikki Linnakangas  writes:
> > A couple of 32-bit x86 buildfarm members don't seem to be happy with 
> > this. I'll investigate, but if anyone has a clue, I'm all ears...
> 
> dromedary's issue seems to be alignment:
> 
> TRAP: UnalignedPointer("(((uintptr_t) ((uintptr_t)(ptr)) + ((8) - 1)) & 
> ~((uintptr_t) ((8) - 1))) != (uintptr_t)(ptr)", File: 
> "../../../../src/include/port/atomics.h", Line: 452)
> 2017-08-16 11:11:38.558 EDT [75693:3] LOG:  server process (PID 76277) was 
> terminated by signal 6: Abort trap
> 2017-08-16 11:11:38.558 EDT [75693:4] DETAIL:  Failed process was running: 
> select count(*) from a_star;
> 
> Not sure if this is your bug or if it's exposing a pre-existing
> deficiency in the atomics code, viz, failure to ensure that
> pg_atomic_uint64 is actually a 64-bit-aligned type.  Andres?

I suspect it's the former.  Suspect that the shared memory that holds
the "parallel desc" isn't properly aligned:

void
ExecSeqScanInitializeDSM(SeqScanState *node,
 ParallelContext *pcxt)
{
...
pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
heap_parallelscan_initialize(pscan,
 
node->ss.ss_currentRelation,
 
estate->es_snapshot);


/*
...
 * We allocate backwards from the end of the segment, so that the TOC entries
 * can grow forward from the start of the segment.
 */
extern void *
shm_toc_allocate(shm_toc *toc, Size nbytes)
{
volatile shm_toc *vtoc = toc;
Sizetotal_bytes;
Sizeallocated_bytes;
Sizenentry;
Sizetoc_bytes;

/* Make sure request is well-aligned. */
nbytes = BUFFERALIGN(nbytes);
...
return ((char *) toc) + (total_bytes - allocated_bytes - nbytes);
}


/*
 * Initialize a region of shared memory with a table of contents.
 */
shm_toc *
shm_toc_create(uint64 magic, void *address, Size nbytes)
{
shm_toc*toc = (shm_toc *) address;

Assert(nbytes > offsetof(shm_toc, toc_entry));
toc->toc_magic = magic;
SpinLockInit(>toc_mutex);
toc->toc_total_bytes = nbytes;
toc->toc_allocated_bytes = 0;
toc->toc_nentry = 0;

return toc;
}

Afaict shm_create/shm_toc_allocate don't actually guarantee that the end
of the toc's memory is suitably aligned.  But I didn't yet have any
coffee, so ...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Simplify plpgsql's check for simple expressions.

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 12:49 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Aug 16, 2017 at 11:56 AM, Tom Lane  wrote:
>>> I can get on board with that statement.  Can you draft a better wording?
>
>> Here is an attempt.  Feel free to edit.
>
> I think s/plan/query/ in the last bit would be better.  Perhaps
>
> +* However, if force_parallel_mode = on or force_parallel_mode = 
> regress,
> +* then we impose parallel mode whenever it's safe to do so, even if 
> the
> +* final plan doesn't use parallelism.  It's not safe to do so if the 
> query
> +* contains anything parallel-unsafe; parallelModeOK will be false in 
> that
> +* case.  Otherwise, everything in the query is either parallel-safe 
> or
> +* parallel-restricted, and in either case it should be OK to impose
> +* parallel-mode restrictions.  If that ends up breaking something, 
> then
> +* either some function the user included in the query is incorrectly
> +* labelled as parallel-safe or parallel-restricted when in reality 
> it's
> +* parallel-unsafe, or else the query planner itself has a bug.
>  */

Works for me.  I'm happy to phrase this in any way that makes it clear
to you, 'cuz it's already clear to me.  :-)

You want to push something, or should I do it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Simplify plpgsql's check for simple expressions.

2017-08-16 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 16, 2017 at 11:56 AM, Tom Lane  wrote:
>> I can get on board with that statement.  Can you draft a better wording?

> Here is an attempt.  Feel free to edit.

I think s/plan/query/ in the last bit would be better.  Perhaps

+* However, if force_parallel_mode = on or force_parallel_mode = 
regress,
+* then we impose parallel mode whenever it's safe to do so, even if the
+* final plan doesn't use parallelism.  It's not safe to do so if the 
query
+* contains anything parallel-unsafe; parallelModeOK will be false in 
that
+* case.  Otherwise, everything in the query is either parallel-safe or
+* parallel-restricted, and in either case it should be OK to impose
+* parallel-mode restrictions.  If that ends up breaking something, then
+* either some function the user included in the query is incorrectly
+* labelled as parallel-safe or parallel-restricted when in reality it's
+* parallel-unsafe, or else the query planner itself has a bug.
 */

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >