Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Joshua D. Drake



Wow, 1960s feminazis, eh? I originally thought you were just a narrow
minded, pedantic and antiquated grammarian. Now I realize that's the least
of your troubles. Please take your misogyny elsewhere. I hear the Rabid
Puppies have openings.


The term feminazi has zero business in this community.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 9:20 AM, Andres Freund  wrote:
> On 2015-09-21 16:36:03 +0200, Andres Freund wrote:
>> Agreed. I'll update the patch.
>
> Here's updated patches against master. These include the "legacy"
> truncation support. There's no meaningful functional differences in this
> version except addressing the review comments that I agreed with, and a
> fair amount of additional polishing.

0002 looks fine.

Regarding 0003, I'm still very much not convinced that it's a good
idea to apply this to 9.3 and 9.4.  This patch changes the way we do
truncation in those older releases; instead of happening at a
restartpoint, it happens when oldestMultiXid advances.  I admit that I
don't see a specific way that that can go wrong, but there are so many
different old versions with slightly different multixact truncation
behaviors that it seems very hard to be sure that we're not going to
make things worse rather than better by introducing yet another
approach to the problem.  I realize that you disagree and will
probably commit this to those branches anyway. But I want it to be
clear that I don't endorse that.

I wish more people were paying attention to these patches.  These are
critical data-corrupting bugs, the code in question is very tricky,
it's been majorly revised multiple times, and we're revising it again.
And nobody except me and Andres is looking at this, and I'm definitely
not smart enough to get this all right.

Other issues:
- If SlruDeleteSegment fails in unlink(), shouldn't we at the very
least log a message?  If that file is still there when we loop back
around, it's going to cause a failure, I think.

Assorted minor nitpicking:
- "happend" is misspelled in the commit message for 0003
- "in contrast to before" should have a comma after it, also in that
commit message
- "how far the next members wraparound is away" -> "how far away the
next members wraparound is"
- "seing" -> "seeing"
- "Upgrade the primary," -> "Upgrade the primary;"
- "toMultiXact" -> "to MultiXact"

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-22 Thread Amit Kapila
On Fri, Sep 18, 2015 at 5:31 PM, Amit Kapila 
wrote:
>
> On Thu, Sep 17, 2015 at 4:44 PM, Robert Haas 
wrote:
> >
> > I haven't studied the planner logic in enough detail yet to have a
> > clear opinion on this.  But what I do think is that this is a very
> > good reason why we should bite the bullet and add outfuncs/readfuncs
> > support for all Plan nodes.  Otherwise, we're going to have to scan
> > subplans for nodes we're not expecting to see there, which seems
> > silly.  We eventually want to allow all of those nodes in the worker
> > anyway.
> >
>
> makes sense to me.  There are 39 plan nodes and it seems we have
> support for all of them in outfuncs and needs to add for most of them
> in readfuncs.
>

Attached patch (read_funcs_v1.patch) contains support for all the plan
and other nodes (like SubPlan which could be required for worker) except
CustomScan node.  CustomScan contains TextOutCustomScan and doesn't
contain corresponding Read function pointer, we could add the support for
same, but I am not sure if CustomScan is required to be passed to worker
in near future, so I am leaving it for now.

To verify the patch, I have done 2 things, first I have added elog to
the newly supported read funcs and then in planner, I have used
nodeToString and stringToNode on planned_stmt and then used the
newly generated planned_stmt for further execution.  After making these
changes, I have ran make check-world and ensures that it covers all the
newly added nodes.

Note, that as we don't populate funcid's in expressions during read, the
same has to be updated by traversing the tree and updating in different
expressions based on node type.  Attached patch (read_funcs_test_v1)
contains the changes required for testing the patch.  I am not very sure
about what do about some of the ForeignScan fields (fdw_private) in order
to update the funcid as the data in those expressions could be FDW specific.
This is anyway for test, so doesn't matter much, but the same will be
required to support read of ForeignScan node by worker.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


read_funcs_v1.patch
Description: Binary data


read_funcs_test_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hot_standby_feedback default and docs

2015-09-22 Thread Peter Eisentraut
On 9/16/15 5:52 PM, Simon Riggs wrote:
> IMHO the default is the best one at the current time.
> See recovery_min_apply_delay.

The applications of recovery_min_apply_delay are likely to be varied and
specific, so there might not be a general answer to this, but wouldn't
you want hot_standby_feedback on with it?  Because the longer you wait
on the standby, the more likely it is that the primary will clean stuff
away.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Erik Rijkers

On 2015-09-22 20:33, Andrew Dunstan wrote:


now some reactionaries and misogynists are fighting to maintain
that somewhat latter day rule.



That's almost as offensive a qualification as 'feminazi', don't you 
agree?


And it rather makes the gender-avoidance sound as a 
political-correctness undertaking which, by the way, was not the reason 
I objected to it.  (It just sounded strange to me (and apparently a few 
others too))


Most likely the original author never thought about giving offence.  And 
most likely he never did give offence.


And as I said earlier: to me it's enough that a few native speakers have 
said this is the way it should be.



Erik Rijkers



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-09-22 Thread Peter Eisentraut
On 9/19/15 10:46 AM, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 7/23/15 6:39 PM, Tom Lane wrote:
>>> + 2202HEERRCODE_INVALID_TABLESAMPLE_ARGUMENT
>>>invalid_tablesample_argument
>>> + 2202GEERRCODE_INVALID_TABLESAMPLE_REPEAT  
>>>invalid_tablesample_repeat
> 
>> Where did you get these error codes from?  The constants in the SQL
>> standard would map to
> 
>> ERRCODE_INVALID_SAMPLE_SIZE
>> ERRCODE_INVALID_REPEAT_ARGUMENT_IN_A_SAMPLE_CLAUSE
> 
>> Were you looking at a different standard, or did you intentionally
>> choose to rephrase?
> 
> I was looking at SQL:2011.  My concern in naming them that way was that
> I wanted to have errcodes that would be general enough for any tablesample
> extension to use, but still be tablesample-specific, ie I don't want them
> to have to fall back on say ERRCODE_INVALID_PARAMETER_VALUE.

Makes sense.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Andrew Dunstan



On 09/22/2015 01:43 PM, Mark Dilger wrote:

On Sep 22, 2015, at 6:09 AM, Andrew Dunstan  wrote:

You are fighting a losing battle. Think of they/them/their/theirs as being 
indefinitely gendered third person singular pronouns, as well as being third 
person plural pronouns. Yes it's a relatively new usage, but I don't think its 
at all unreasonable (speaking as someone who has been known to dislike some new 
usages and neologisms). It's not at all sloppy. On the contrary, it's quite 
deliberate. It's just not quite traditional. You need to get over that.

The use of "their" as singular dates back at least as far as Chaucer in the 
14th century, prior to
the use of "you" as a singular pronoun.  Militant grammarian schoolteachers may 
have told you
not to use it that way, but that doesn't change the history of its use.



[recipient list trimmed]

Good point. In the 18th and 19th centuries it was deemed by some 
grammarians to be incorrect for some reason, (and yet Thackeray still 
used it in Vanity Fair, for instance) and now some reactionaries and 
misogynists are fighting to maintain that somewhat latter day rule. But 
I'm pretty certain their numbers will dwindle, as they preach to an ever 
shrinking choir.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 1:57 PM, Andres Freund  wrote:
> On 2015-09-22 13:38:58 -0400, Robert Haas wrote:
>> Regarding 0003, I'm still very much not convinced that it's a good
>> idea to apply this to 9.3 and 9.4.  This patch changes the way we do
>> truncation in those older releases; instead of happening at a
>> restartpoint, it happens when oldestMultiXid advances.
>
> The primary reason for doing that is that doing it at restartpoints is
> simply *wrong*. Restartpoints aren't scheduled in sync with replay -
> which means that a restartpoint can (will actually) happen long long
> after the checkpoint from the primary has replayed.  Which means that by
> the time the restartpoint is performed it's actually not unlikely that
> we've already filled all slru segments. Which is bad if we then fail
> over/start up.

1. It would be possible to write a patch that included ONLY the
changes needed to make that happen, and did nothing else.  It would be
largely a subset of this.  If we want to change 9.3 and 9.4, I
recommend we do that first, and then come back to the rest of this.

2. I agree that what we're doing right now is wrong.  And I agree that
this fixes a real problem. But it seems to me to be quite possible,
even likely, that it will create other problems.

For example, suppose that there are files in the data directory that
precede oldestMultiXact. In the current approach, we'll remove those
because they're not in the range we expect to be used.  But in this
approach we no longer remove everything we think shouldn't be there.
We remove exactly the stuff we think should go away.  As a general
principle, that's clearly superior.  But in the back-branches, it
creates a risk: a leftover old file that doesn't get removed the first
time through - for whatever reason - becomes a time bomb that will
explode on the next wraparound.  I don't know that that will happen.
But I sure as heck don't know that won't happen with any combination
of the variously broken 9.3.X releases we've put out there.  Even if
you can prove that particular risk never materializes to your
satisfaction and mine, I will bet you a beer that there are other
possible hazards neither of us is foreseeing right now.

>> I realize that you disagree and will probably commit this to those
>> branches anyway. But I want it to be clear that I don't endorse that.
>
> I don't plan to commit/backpatch this over your objection.

I'm not in a position to demand that you take my advice, but I'm
telling you what I think as honestly as I know how.

To be clear, I am fully in favor of making these changes (without the
legacy truncation stuff) in 9.5 and master, bumping WAL page magic so
that we invalidate any 9.5 alpha standys.  I think it's a far more
solid approach than what we've got right now, and it clearly
eliminates a host of dangers.  In fact, I think it would be a pretty
stupid idea not to make these changes in those branches.  It would be
doubling down on a design we know can never be made robust.

But I do not have confidence that we can change 9.4 and especially 9.3
without knock-on consequences.  You may have that confidence.  I most
definitely do not.  My previous two rounds in the boxing ring with
this problem convinced me that (1) it's incredibly easy to break
things with well-intentioned changes in this area, (2) it's
practically impossible to foresee everything that might go wrong with
some screwy combination of versions, and (3) early 9.3.X releases are
in much worse shape than early 9.4.X releases, to the point where
guessing what any given variable is going to contain on 9.3.X is
essentially throwing darts at the wall.  That's an awfully challenging
environment in which to write a bullet-proof patch.

>> - If SlruDeleteSegment fails in unlink(), shouldn't we at the very
>> least log a message?  If that file is still there when we loop back
>> around, it's going to cause a failure, I think.
>
> The existing unlink() call doesn't, that's the only reason I didn't add
> a message there. I'm fine with adding a (LOG or WARNING?) message.

Cool.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Calculage avg. width when operator = is missing

2015-09-22 Thread Andrew Dunstan



On 09/22/2015 12:16 PM, Shulgin, Oleksandr wrote:

Hi Hackers,

I've recently stumbled upon a problem with table bloat estimation in 
case there are columns of type JSON.


The quick bloat estimation queries use sum over pg_statistic.stawidth 
of table's columns, but in case of JSON the corresponding entry is 
never created by the ANALYZE command due to equality comparison 
operator missing.  I understand why there is no such operator defined 
for this particular type, but shouldn't we still try to produce 
meaningful average width estimation?


In my case the actual bloat is around 40% as verified with 
pgstattuple, while the bloat reported by quick estimate can be between 
75% and 95%(!) in three instances of this problem.  We're talking 
about some hundreds of GB of miscalculation.


Attached patch against master makes the std_typanalyze still try to 
compute the minimal stats even if there is no "=" operator.  Makes sense?


I could also find this report in archives that talks about similar 
problem, but due to all values being over the analyze threshold:


http://www.postgresql.org/message-id/flat/12480.1389370...@sss.pgh.pa.us#12480.1389370...@sss.pgh.pa.us

I think we could try harder, otherwise any estimate relying on average 
width can be way off in such cases.



Yes, "/revenons/ à /nos moutons/." You can set up text based comparison 
ops fairly easily for json - you just need to be aware of the 
limitations. See https://gist.github.com/adunstan/32ad224d7499d2603708


But I agree we should be able to do some analysis of types without 
comparison ops.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Calculage avg. width when operator = is missing

2015-09-22 Thread Shulgin, Oleksandr
On Sep 22, 2015 8:58 PM, "Andrew Dunstan"  wrote:
>
>
>
> On 09/22/2015 12:16 PM, Shulgin, Oleksandr wrote:
>>
>> Hi Hackers,
>>
>> I've recently stumbled upon a problem with table bloat estimation in
case there are columns of type JSON.
>>
>> The quick bloat estimation queries use sum over pg_statistic.stawidth of
table's columns, but in case of JSON the corresponding entry is never
created by the ANALYZE command due to equality comparison operator
missing.  I understand why there is no such operator defined for this
particular type, but shouldn't we still try to produce meaningful average
width estimation?
>>
>> In my case the actual bloat is around 40% as verified with pgstattuple,
while the bloat reported by quick estimate can be between 75% and 95%(!) in
three instances of this problem.  We're talking about some hundreds of GB
of miscalculation.
>>
>> Attached patch against master makes the std_typanalyze still try to
compute the minimal stats even if there is no "=" operator.  Makes sense?
>>
>> I could also find this report in archives that talks about similar
problem, but due to all values being over the analyze threshold:
>>
>>
http://www.postgresql.org/message-id/flat/12480.1389370...@sss.pgh.pa.us#12480.1389370...@sss.pgh.pa.us
>>
>> I think we could try harder, otherwise any estimate relying on average
width can be way off in such cases.
>
> Yes, "/revenons/ à /nos moutons/." You can set up text based comparison
ops fairly easily for json - you just need to be aware of the limitations.
See https://gist.github.com/adunstan/32ad224d7499d2603708

Yes, I've already tried this approach and have found that analyze
performance degrades an order of magnitude due to sql-level function
overhead and casts to text.  In my tests, from 200ms to 2000ms with btree
ops on a default sample of 30,000 rows.

Should have mentioned that.

There is a very hacky way to substitute bttextcmp for the sort support
function after defining the opclass by updating pg_amproc, buy I would
rather avoid that. :-)

--
Alex


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 2:33 PM, Andrew Dunstan  wrote:
> Good point. In the 18th and 19th centuries it was deemed by some grammarians
> to be incorrect for some reason, (and yet Thackeray still used it in Vanity
> Fair, for instance) and now some reactionaries and misogynists are fighting
> to maintain that somewhat latter day rule. But I'm pretty certain their
> numbers will dwindle, as they preach to an ever shrinking choir.

I agree that this construction is grammatically acceptable in many if
not all cases, but I still think that phrasing the sentences to avoid
this construction is a good idea where we can do it easily.  For
example, this is clearly a good idea:

 So the database administrator can
 decide which languages are available in which databases and can make
-some languages available by default if he chooses.
+some languages available by default if desired.

And so is this, which just gets rid of a sentence that really isn't needed:

 Possibly, your site administrator has already created a database
-for your use.  He should have told you what the name of your
-database is.  In that case you can omit this step and skip ahead
+for your use.  In that case you can omit this step and skip ahead
 to the next section.

But consider this one:

-return any user name he chooses. This authentication method is
+return any user name they choose. This authentication method is

You could say "any arbitrary user name" or "any username whatsoever".

Or here:

   or within a session via the SET command.  Any user is
-  allowed to change his session-local value.  Changes in
+  allowed to change their session-local value.  Changes in

You could say "This requires no special privileges".  This isn't
really an exact rewrite of the sentence, but in context it means the
same thing.

Or here:

--- Who works for us when she must pay for it?
+-- Who works for us when they must pay for it?

You could say "-- We pay employees; they don't pay us."

I don't think any of these changes are outright wrong except for
"might not be the same as the database user that is to be connect as",
which seems like a muddle.  But I think some of them could be changed
to use other wording that would read more smoothly.

Of course, that is just my opinion, and I clearly feel a lot less
strongly about this than some other people.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Joshua D. Drake

Hello,

-  environment variable); any user can make such a change for his 
session.
+  environment variable); any user can make such a change for the 
session.


Or

+ environment variable); any user can make such a change for the 
connected session.


-  allowed to change his session-local value.  Changes in
+  allowed to change the connected session-local value.  Changes in

-   might not be the same as the database user he needs to connect as.
+   might not be the same as the database user that one is connected as.

-return any user name he chooses. This authentication method is
+return any user name one may choose. This authentication method is


etc...

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Garick Hamlin
On Tue, Sep 22, 2015 at 11:59:21AM -0400, Robert Haas wrote:
> On Tue, Sep 22, 2015 at 11:00 AM, Andrew Dunstan  wrote:
> >> Happily for me, I can continue to write documents in a grammatically
> >> correct way, and no-one will read them and think I'm a grammar-nazi (or
> >> obstinate, or old-fashioned or whatever) because unless they're 
> >> specifically
> >> looking for it no-one will notice that I'm avoiding the contentious usage
> >> altogether. On the other hand, there _will_ be a (perhaps significant)
> >> proportion of people who read your documents and think that you're 
> >> incapable
> >> of writing a grammatically correct sentence.
> >
> > Wow, 1960s feminazis, eh? I originally thought you were just a narrow
> > minded, pedantic and antiquated grammarian. Now I realize that's the least
> > of your troubles. Please take your misogyny elsewhere. I hear the Rabid
> > Puppies have openings.
> 
> I think this discussion could benefit from a little more light and a
> lot less heat.
> 
> Here's my contribution: the use of they does sometimes seek awkward.
> However, it's not remotely new:
> 
> https://stroppyeditor.wordpress.com/2015/04/21/everything-you-ever-wanted-to-know-about-singular-they/
> https://en.wikipedia.org/wiki/Singular_they#Older_usage_by_respected_authors
> http://englishbibles.blogspot.com/2006/09/singular-they-in-english-bibles.html
> 
> And I do think it's generally worthwhile to avoid the use of "he"
> where possible.  Would I have done it exactly the way that Peter did
> it here?  Uh, no.  Is it better than not doing anything?  In my
> opinion, yes.

I agree, I think we should avoid gendered pronouns.  

Also, the modern use of they/their absolutely fits here.  It reflects a
deliberate considered choice of of the writer to be inclusive and correct.

Constructs like 'he or she' exclude people.  Enumerating genders is not 
inclusive.  It leads to exclusion and erasure of people who have an non-binary
identities. 

Garick


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Andres Freund
On 2015-09-22 13:38:58 -0400, Robert Haas wrote:
> Regarding 0003, I'm still very much not convinced that it's a good
> idea to apply this to 9.3 and 9.4.  This patch changes the way we do
> truncation in those older releases; instead of happening at a
> restartpoint, it happens when oldestMultiXid advances.

The primary reason for doing that is that doing it at restartpoints is
simply *wrong*. Restartpoints aren't scheduled in sync with replay -
which means that a restartpoint can (will actually) happen long long
after the checkpoint from the primary has replayed.  Which means that by
the time the restartpoint is performed it's actually not unlikely that
we've already filled all slru segments. Which is bad if we then fail
over/start up.

Aside from the more fundamental issue that restartpoints have to be
"asynchronous" with respect to the checkpoint record for performance
reasons, there's a bunch of additional reasons making this even more
likely to occur: Differing checkpoint segments on the standby and
pending actions (which we got rid off in 9.5+, but ...)

> I realize that you disagree and will probably commit this to those
> branches anyway. But I want it to be clear that I don't endorse that.

I don't plan to commit/backpatch this over your objection.

I do think it'd be the better approach, and I personally think that
we're much more likely to introduce bugs if we backpatch this in a
year. Which I think we'll end up having to. The longer people run on
these branches, the more issues we'll see.

> I wish more people were paying attention to these patches.

+many

> Other issues:
> - If SlruDeleteSegment fails in unlink(), shouldn't we at the very
> least log a message?  If that file is still there when we loop back
> around, it's going to cause a failure, I think.

The existing unlink() call doesn't, that's the only reason I didn't add
a message there. I'm fine with adding a (LOG or WARNING?) message.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Alvaro Herrera
Robert Haas wrote:

> Regarding 0003, I'm still very much not convinced that it's a good
> idea to apply this to 9.3 and 9.4.  This patch changes the way we do
> truncation in those older releases; instead of happening at a
> restartpoint, it happens when oldestMultiXid advances.  I admit that I
> don't see a specific way that that can go wrong, but there are so many
> different old versions with slightly different multixact truncation
> behaviors that it seems very hard to be sure that we're not going to
> make things worse rather than better by introducing yet another
> approach to the problem.  I realize that you disagree and will
> probably commit this to those branches anyway. But I want it to be
> clear that I don't endorse that.

Noted.  I am not sure about changing things so invasively either TBH.
The interactions of this stuff with other parts of the system are very
complicated and it's easy to make a mistake that goes unnoticed until
some weird scenario is run elsewhere.  (Who would have thought that
things would fail when a basebackup takes 12 hours to take and you have
a custom preemptive tuple freeze script in crontab).

> I wish more people were paying attention to these patches.  These are
> critical data-corrupting bugs, the code in question is very tricky,
> it's been majorly revised multiple times, and we're revising it again.
> And nobody except me and Andres is looking at this, and I'm definitely
> not smart enough to get this all right.

I'm also looking, and yes it's tricky.

> Other issues:

It would be good to pgindent the code before producing back-branch
patches.  I think some comments will get changed.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Erik Rijkers

On 2015-09-22 19:25, Garick Hamlin wrote:


Constructs like 'he or she' exclude people.



This remains difficult for me to follow but with all the native speakers 
being in favor of this change I for one retract my objections.



thanks,

Erik Rijkers



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Can't connect with PGSSLMODE=require on Windows

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 11:23 AM, Andrew Dunstan  wrote:
> "git bisect" is your friend.

Yeah, but finding someone who has a working Windows build environment
and a lot of time to run this down is my enemy.  We're trying, but if
anyone else has a clue, that would be much appreciated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Peter Geoghegan
On Tue, Sep 22, 2015 at 7:11 AM, Andrew Dunstan  wrote:
> You can think that if you like, but it's not even remotely true. It's a
> deliberate choice to use a new, perfectly reasonable and now widely accepted
> style of which you disapprove, but it's not lazy.

It never occurred to me that this usage is even non-traditional. I am
a native English speaker born in Ireland in the 1980s.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Mark Dilger

> On Sep 22, 2015, at 6:09 AM, Andrew Dunstan  wrote:
> 
> You are fighting a losing battle. Think of they/them/their/theirs as being 
> indefinitely gendered third person singular pronouns, as well as being third 
> person plural pronouns. Yes it's a relatively new usage, but I don't think 
> its at all unreasonable (speaking as someone who has been known to dislike 
> some new usages and neologisms). It's not at all sloppy. On the contrary, 
> it's quite deliberate. It's just not quite traditional. You need to get over 
> that.

The use of "their" as singular dates back at least as far as Chaucer in the 
14th century, prior to 
the use of "you" as a singular pronoun.  Militant grammarian schoolteachers may 
have told you
not to use it that way, but that doesn't change the history of its use.

Mark Dilger



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] planstate_tree_walker oversight CustomScan

2015-09-22 Thread Robert Haas
On Mon, Sep 21, 2015 at 9:54 AM, Kouhei Kaigai  wrote:
> The planstate_tree_walker() oversight custom_ps of CustomScanState;
> that should be a list of underlying PlanState object if any.
>
> ExplainPreScanNode() treated ForeignScan and CustomScan in special
> way (it is sufficient for ExplainPreScanNode() purpose), thus, it
> didn't implement its recursive portion originally.
>
> The job of ExplainPreScanNode() is know all the relids involved
> in a particular subquery execution. On the other hands, fs_relids
> of ForeignScan and custom_relids of CustomScan informs a set of
> relids to be scanned by this Scan node without recursive, so it
> did not have recursive walks on the underlying sub-plans.
>
> However, planstate_tree_walker() will have different expectation.
> It is general walker routine, thus, it is natural users to expect
> the callback is also kicked towards the underlying planstate of
> CustomScan (and ForeignScan; once EPQ recheck gets solved).
>
> The attached patch adds support of CustomScan on the walker.

Do you need to add something to ExplainPreScanNode for this as well,
like if (scanrelid != 0) *rels_used = bms_add_member(*rels_used,
scanrelid)?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 9:18 PM, Amit Kapila  wrote:
> PlannedStmt is needed because we are passing the same from master
> to worker for execution and the reason was that Executor interfaces
> expect it.

I thought we were passing the Plan and then the worker was constructed
a PlannedStmt around it.  If we're passing the PlannedStmt then I
guess we need PlanInvalItem too, since there is a list of those
hanging off of the PlannedStmt.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 9:12 PM, Kouhei Kaigai  wrote:
> Oh... I did exactly duplicated job a few days before.
>   
> https://github.com/kaigai/sepgsql/blob/readfuncs/src/backend/nodes/readfuncs.c

Please post the patch here, and clarify that it is under the PostgreSQL license.

> Regarding of CustomScan node, I'd like to run on worker process as soon as
> possible once it gets supported. I'm highly motivated.

Great.

> Andres raised a related topic a few weeks before:
> http://www.postgresql.org/message-id/20150825181933.ga19...@awork2.anarazel.de
>
> Here are two issues:
>
> * How to reproduce "methods" pointer on another process. Extension may not be
>   loaded via shared_preload_libraries.

The parallel mode stuff already has code to make sure that the same
libraries that were loaded in the original backend get loaded in the
new one.  But that's not going to make the same pointer valid there.

> -> One solution is to provide a pair of library and symbol name of the method
>table, instead of the pointer. I think it is a reasonable idea.

I agree.

> * How to treat additional output of TextOutCustomScan.
> -> Here are two solutions. (1) Mark TextOutCustomScan as an obsolete callback,
>however, it still makes Andres concern because we need to form/deform 
> private
>data for copyObject safe. (2) Add TextReadCustomScan (and 
> NodeEqualCustomScan?)
>callback to process private fields.

I don't see how making it obsolete solves anything.  Any node that
wants to run in a worker needs to have outfuncs and readfuncs support.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] a funnel by any other name

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs  wrote:
> Robert, thanks for asking. We'll be stuck with these words for some time,
> user visible via EXPLAIN so this is important.

I agree, thanks for taking an interest.

> The main operations are the 3 mentioned by Nicolas:
> 1. Send data from many to one - which has subtypes for Unsorted, Sorted and
> Evenly balanced (but unsorted)
> 2. Send data from one process to many
> 3. Send data from many to many
>
> My preferences for this would be
> 1. Gather (but not Gather Motion) e.g. Gather, Gather Sorted
> 2. Scatter (since Broadcast only makes sense in the context of a distributed
> query, it sounds weird for intra-node query)
> 3. Redistribution - which implies the description of how we spread data
> across nodes is "Distribution" (or DISTRIBUTED BY)

"Scatter" isn't one of the things that I mentioned in my original
email.  Not sure where we'd use that, although there might be
somewhere.

> For 3 we should definitely use Redistribute, since this is what Teradata has
> been calling it for 30 years, which is where Greenplum got it from.

That's a reasonable option.  We can bikeshed it some more when we get that far.

> For 1, Gather makes most sense.

Yeah, I'm leaning that way myself.  Amit argued for "Parallel Gather"
but I think that's overkill.  There can't be a non-parallel gather,
and long names are a pain.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] planstate_tree_walker oversight CustomScan

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 9:30 PM, Kouhei Kaigai  wrote:
> The latest ExplainPreScanNode is sufficient. Regardless of scanrelid
> (even if it is zero), fs_relids and custom_relids shall be set properly
> to introduce which relations are represented by this ForeignScan and
> CustomScan node. So, additional planstate_tree_walker() call might be
> a bit redundant, but harmless.
>
> The reason why ForeignScan/CustomScan node have these bitmap is, we
> cannot guarantee they always have underlying scan node. For example,
> ForeignScan that kicks remote join query will not have local underlying
> scan node on the foreign tables to be involved.
> So, we had to inform ExplainPreScanNode which relids are represented
> by this Foreign/CustomScan node.

OK, I've just committed the patch the way it is for now, then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Peter Geoghegan
On Tue, Sep 22, 2015 at 6:55 PM, Alvaro Herrera
 wrote:
> So if I have multiple queries like
>
> SELECT foo FROM bar WHERE baz IN (a, b)
> SELECT foo FROM bar WHERE baz IN (a, b, c)
>
> they are not normalized down to the same?  That seems odd.

Yes, although in practice it's usually down to a variable number of
constants appearing within the "IN ( )", which is more odd IMV.

We discussed changing this before. I don't have strong feelings either way.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] a funnel by any other name

2015-09-22 Thread Alvaro Herrera
Robert Haas wrote:
> On Tue, Sep 22, 2015 at 10:34 AM, Simon Riggs  wrote:

> > For 1, Gather makes most sense.
> 
> Yeah, I'm leaning that way myself.  Amit argued for "Parallel Gather"
> but I think that's overkill.  There can't be a non-parallel gather,
> and long names are a pain.

"Gather" seems a pretty decent choice to me too, even if we only have a
single worker (your "1").  I don't think there's much need to
distinguish 1 from 2, is there?

We can bikeshed the other names when the time comes; the insight in the
thread is good to have.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-22 Thread Kouhei Kaigai
> On Tue, Sep 22, 2015 at 9:12 PM, Kouhei Kaigai  wrote:
> > Oh... I did exactly duplicated job a few days before.
> >
> https://github.com/kaigai/sepgsql/blob/readfuncs/src/backend/nodes/readfuncs
> .c
> 
> Please post the patch here, and clarify that it is under the PostgreSQL 
> license.
>
Of course. I intend to submit.

> > Regarding of CustomScan node, I'd like to run on worker process as soon as
> > possible once it gets supported. I'm highly motivated.
> 
> Great.
> 
> > Andres raised a related topic a few weeks before:
> >
> http://www.postgresql.org/message-id/20150825181933.GA19326@awork2.anarazel.
> de
> >
> > Here are two issues:
> >
> > * How to reproduce "methods" pointer on another process. Extension may not 
> > be
> >   loaded via shared_preload_libraries.
> 
> The parallel mode stuff already has code to make sure that the same
> libraries that were loaded in the original backend get loaded in the
> new one.  But that's not going to make the same pointer valid there.
> 
> > -> One solution is to provide a pair of library and symbol name of the 
> > method
> >table, instead of the pointer. I think it is a reasonable idea.
> 
> I agree.
> 
> > * How to treat additional output of TextOutCustomScan.
> > -> Here are two solutions. (1) Mark TextOutCustomScan as an obsolete 
> > callback,
> >however, it still makes Andres concern because we need to form/deform 
> > private
> >data for copyObject safe. (2) Add TextReadCustomScan (and
> NodeEqualCustomScan?)
> >callback to process private fields.
> 
> I don't see how making it obsolete solves anything.  Any node that
> wants to run in a worker needs to have outfuncs and readfuncs support.
>
In actually, I'm inclined to the (2) rather than (1).
In case of (2), we shall need two new callbacks on _copyCustomScan and
_readCustomScan. I'll try to make up.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] unclear about row-level security USING vs. CHECK

2015-09-22 Thread Peter Eisentraut
I'm testing the new row-level security feature.  I'm not clear on the
difference between the USING and CHECK clauses in the CREATE POLICY
statement.

The documentation says:

"""
A policy grants the ability to SELECT, INSERT, UPDATE, or DELETE rows
which match the relevant policy expression. Existing table rows are
checked against the expression specified via USING, while new rows that
would be created via INSERT or UPDATE are checked against the expression
specified via WITH CHECK. When a USING expression returns true for a
given row then that row is visible to the user, while if a false or null
is returned then the row is not visible. When a WITH CHECK expression
returns true for a row then that row is added, while if a false or null
is returned then an error occurs.
"""

So basically, USING filters out what you see, CHECK controls what you
can write.

But then this doesn't work correctly:

CREATE TABLE test1 (content text, entered_by text);
ALTER TABLE test1 ENABLE ROW LEVEL SECURITY;
CREATE POLICY test1_policy ON test1 FOR ALL TO PUBLIC USING (entered_by
= current_user);
GRANT ALL ON TABLE test1 TO PUBLIC;

CREATE USER foo1;
SET SESSION AUTHORIZATION foo1;
INSERT INTO test1 VALUES ('blah', 'foo2');  -- fails

This is a typical you-can-only-see-your-own-rows setup, which works for
the reading case, but it evidently also controls writes.  So I'm not
sure what the CHECK clause is supposed to add on top of that.

(Btw., what's the meaning of a policy for DELETE?)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Peter Geoghegan
On Tue, Sep 22, 2015 at 5:01 PM, Peter Geoghegan  wrote:
> My guess is that this very large query involved a very large number of
> constants, possibly contained inside an " IN ( )". Slight variants of
> the same query, that a human would probably consider to be equivalent
> have caused artificial pressure on garbage collection.

I could write a patch to do compaction in-place. The basic idea is
that there'd be a slow path in the event of an OOM-like condition
(i.e. an actual OOM, or when the MaxAllocSize limitation is violated)
that first scans through entries, and determines the exact required
buffer size for every non-garbage query text. As this
iteration/scanning occurs, the entries' offsets in shared memory are
rewritten assuming that the first entry starts at 0, the second at 0 +
length of first + 1 (for NUL sentinal byte), and so on. We then
allocate a minimal buffer, lseek() and copy into the buffer, so that
the expectation of finding query texts at those offsets is actually
met. Finally, unlink() old file, create new one, and write new buffer
out. I think I wanted to do things that way originally.

If even that exact, minimal buffer size cannot be allocated, then ISTM
that the user is out of luck. That will be very rare in practice, but
should it occur we log the issue and give up on storing query texts
entirely, so as to avoid thrashing while still giving the user
something to go on. This new code path is never hit until a garbage
collection is required, so hopefully the garbage created was not a
pathological issue with a weird workload, but rather something that
will not recur for a very long time.

That seems to me to be better than getting into the business of
deciding how long of a query text is too long.

I'm doubtful that this had anything to do with MaxAllocSize. You'd
certainly need a lot of bloat to be affected by that in any way. I
wonder how high pg_stat_statements.max was set to on this system, and
how long each query text was on average.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-22 Thread Amit Kapila
On Wed, Sep 23, 2015 at 5:42 AM, Robert Haas  wrote:
>
> On Tue, Sep 22, 2015 at 3:21 PM, Amit Kapila 
wrote:

>
> readAttrNumberCols uses sizeof(Oid) instead of sizeof(AttrNumber).
>
> I still don't understand why we need to handle PlanInvalItem.

As such this is not required, just to maintain consistency as I have added
other similar nodes like PlanRowMark and NestLoopParam.

>
> Actually, come to think of it, I'm not sure we need PlannedStmt
> either.
>

PlannedStmt is needed because we are passing the same from master
to worker for execution and the reason was that Executor interfaces
expect it.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] unclear about row-level security USING vs. CHECK

2015-09-22 Thread Charles Clavadetscher
Hello Peter

> I'm testing the new row-level security feature.  I'm not clear on the
> difference between the USING and CHECK clauses in the CREATE POLICY
> statement.
> 
> The documentation says:
> 
> """
> A policy grants the ability to SELECT, INSERT, UPDATE, or DELETE rows
> which match the relevant policy expression. Existing table rows are
> checked against the expression specified via USING, while new rows that
> would be created via INSERT or UPDATE are checked against the expression
> specified via WITH CHECK. When a USING expression returns true for a
> given row then that row is visible to the user, while if a false or null
> is returned then the row is not visible. When a WITH CHECK expression
> returns true for a row then that row is added, while if a false or null
> is returned then an error occurs.
> """
> 
> So basically, USING filters out what you see, CHECK controls what you
> can write.

Yes, for the command that you specified in the FOR clause. This is quite 
important if you need different conditions for different commands, e.g. see all 
rows, modify only some.

This may help to better understand how this is meant:
http://www.postgresql.org/message-id/20150711132144.gs12...@tamriel.snowman.net

> But then this doesn't work correctly:
> 
> CREATE TABLE test1 (content text, entered_by text);
> ALTER TABLE test1 ENABLE ROW LEVEL SECURITY;
> CREATE POLICY test1_policy ON test1 FOR ALL TO PUBLIC USING (entered_by
> = current_user);
> GRANT ALL ON TABLE test1 TO PUBLIC;
> 
> CREATE USER foo1;
> SET SESSION AUTHORIZATION foo1;
> INSERT INTO test1 VALUES ('blah', 'foo2');  -- fails
> 
> This is a typical you-can-only-see-your-own-rows setup, which works for
> the reading case, but it evidently also controls writes. So I'm not
> sure what the CHECK clause is supposed to add on top of that.

Since the policy is defined for ALL commands and no WITH CHECK is specified 
then the same condition defined in USING takes effect for all commands, i.e. 
including INSERT.
 
>From the docs 
>(http://www.postgresql.org/docs/9.5/static/sql-createpolicy.html): "Further, 
>for commands which can have both USING and WITH CHECK policies (ALL and 
>UPDATE), if no WITH CHECK policy is defined then the USING policy will be used 
>for both what rows are visible (normal USING case) and which rows will be 
>allowed to be added (WITH CHECK case)."

If you want e.g. to allow users to insert rows without the restriction of being 
the current_user in column entered_by then you would need separate policies for 
each command. If you define a policy for INSERT, USING does not make sense. In 
the thread above there is a similar example to this as well as in the 
documentation:

http://www.postgresql.org/docs/9.5/static/ddl-rowsecurity.html

> (Btw., what's the meaning of a policy for DELETE?)

In your example it means that users can delete only the rows where entered_by = 
current_user. A WITH CHECK policy does not make sense in this case.

I assume that having USING and WITH CHECK for filtering and controlling added 
rows was introduced for use cases where these conditions are not the same, i.e. 
to allow for more flexibility. On the spot I don't have an example, but maybe 
somebody else can deliver one.

Regards
Charles




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-22 Thread Kouhei Kaigai
> > On Thu, Sep 17, 2015 at 4:44 PM, Robert Haas  wrote:
> > >
> > > I haven't studied the planner logic in enough detail yet to have a
> > > clear opinion on this.  But what I do think is that this is a very
> > > good reason why we should bite the bullet and add outfuncs/readfuncs
> > > support for all Plan nodes.  Otherwise, we're going to have to scan
> > > subplans for nodes we're not expecting to see there, which seems
> > > silly.  We eventually want to allow all of those nodes in the worker
> > > anyway.
> > >
> >
> > makes sense to me.  There are 39 plan nodes and it seems we have
> > support for all of them in outfuncs and needs to add for most of them
> > in readfuncs.
> >
> 
> Attached patch (read_funcs_v1.patch) contains support for all the plan
> and other nodes (like SubPlan which could be required for worker) except
> CustomScan node.  CustomScan contains TextOutCustomScan and doesn't
> contain corresponding Read function pointer, we could add the support for
> same, but I am not sure if CustomScan is required to be passed to worker
> in near future, so I am leaving it for now.
>
Oh... I did exactly duplicated job a few days before.
  https://github.com/kaigai/sepgsql/blob/readfuncs/src/backend/nodes/readfuncs.c

Regarding of CustomScan node, I'd like to run on worker process as soon as
possible once it gets supported. I'm highly motivated.

Andres raised a related topic a few weeks before:
http://www.postgresql.org/message-id/20150825181933.ga19...@awork2.anarazel.de

Here are two issues:

* How to reproduce "methods" pointer on another process. Extension may not be
  loaded via shared_preload_libraries.
-> One solution is to provide a pair of library and symbol name of the method
   table, instead of the pointer. I think it is a reasonable idea.

* How to treat additional output of TextOutCustomScan.
-> Here are two solutions. (1) Mark TextOutCustomScan as an obsolete callback,
   however, it still makes Andres concern because we need to form/deform private
   data for copyObject safe. (2) Add TextReadCustomScan (and 
NodeEqualCustomScan?)
   callback to process private fields.

> To verify the patch, I have done 2 things, first I have added elog to
> the newly supported read funcs and then in planner, I have used
> nodeToString and stringToNode on planned_stmt and then used the
> newly generated planned_stmt for further execution.  After making these
> changes, I have ran make check-world and ensures that it covers all the
> newly added nodes.
> 
> Note, that as we don't populate funcid's in expressions during read, the
> same has to be updated by traversing the tree and updating in different
> expressions based on node type.  Attached patch (read_funcs_test_v1)
> contains the changes required for testing the patch.  I am not very sure
> about what do about some of the ForeignScan fields (fdw_private) in order
> to update the funcid as the data in those expressions could be FDW specific.
> This is anyway for test, so doesn't matter much, but the same will be
> required to support read of ForeignScan node by worker.
>
Because of interface contract, it is role of FDW driver to put nodes which
are safe to copyObject on fdw_exprs and fdw_private field. Unless FDW driver
does not violate, fdw_exprs and fdw_private shall be reproduced on the worker
side. (Of course, we cannot guarantee nobody has local pointer on private
field...)
Sorry, I cannot understand the sentence of funcid population. It seems to me
funcid is displayed as-is, and _readFuncExpr() does nothing special...?


Robert Haas said:
> I think it would be worth doing something like this:
> 
> #define READ_ATTRNUMBER_ARRAY(fldname, len) \
> token = pg_strtok(); \
> local_node->fldname = readAttrNumberCols(len);
> 
> And similarly for READ_OID_ARRAY, READ_BOOL_ARRAY, READ_INT_ARRAY.
>
Like this?
https://github.com/kaigai/sepgsql/blob/readfuncs/src/backend/nodes/readfuncs.c#L133

I think outfuncs.c also have similar macro to centralize the format of array.
Actually, most of boolean array are displayed using booltostr(), however, only
_outMergeJoin() uses "%d" format to display boolean as integer.
It is a bit inconsistent manner.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] planstate_tree_walker oversight CustomScan

2015-09-22 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> Sent: Wednesday, September 23, 2015 10:15 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] planstate_tree_walker oversight CustomScan
> 
> On Mon, Sep 21, 2015 at 9:54 AM, Kouhei Kaigai  wrote:
> > The planstate_tree_walker() oversight custom_ps of CustomScanState;
> > that should be a list of underlying PlanState object if any.
> >
> > ExplainPreScanNode() treated ForeignScan and CustomScan in special
> > way (it is sufficient for ExplainPreScanNode() purpose), thus, it
> > didn't implement its recursive portion originally.
> >
> > The job of ExplainPreScanNode() is know all the relids involved
> > in a particular subquery execution. On the other hands, fs_relids
> > of ForeignScan and custom_relids of CustomScan informs a set of
> > relids to be scanned by this Scan node without recursive, so it
> > did not have recursive walks on the underlying sub-plans.
> >
> > However, planstate_tree_walker() will have different expectation.
> > It is general walker routine, thus, it is natural users to expect
> > the callback is also kicked towards the underlying planstate of
> > CustomScan (and ForeignScan; once EPQ recheck gets solved).
> >
> > The attached patch adds support of CustomScan on the walker.
> 
> Do you need to add something to ExplainPreScanNode for this as well,
> like if (scanrelid != 0) *rels_used = bms_add_member(*rels_used,
> scanrelid)?
>
The latest ExplainPreScanNode is sufficient. Regardless of scanrelid
(even if it is zero), fs_relids and custom_relids shall be set properly
to introduce which relations are represented by this ForeignScan and
CustomScan node. So, additional planstate_tree_walker() call might be
a bit redundant, but harmless.

The reason why ForeignScan/CustomScan node have these bitmap is, we
cannot guarantee they always have underlying scan node. For example,
ForeignScan that kicks remote join query will not have local underlying
scan node on the foreign tables to be involved.
So, we had to inform ExplainPreScanNode which relids are represented
by this Foreign/CustomScan node.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Alvaro Herrera
Peter Geoghegan wrote:

> My guess is that this very large query involved a very large number of
> constants, possibly contained inside an " IN ( )". Slight variants of
> the same query, that a human would probably consider to be equivalent
> have caused artificial pressure on garbage collection.

So if I have multiple queries like

SELECT foo FROM bar WHERE baz IN (a, b)
SELECT foo FROM bar WHERE baz IN (a, b, c)

they are not normalized down to the same?  That seems odd.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Christopher Browne
On 22 September 2015 at 15:11, Robert Haas  wrote:
>
> On Tue, Sep 22, 2015 at 2:33 PM, Andrew Dunstan 
wrote:
> > Good point. In the 18th and 19th centuries it was deemed by some
grammarians
> > to be incorrect for some reason, (and yet Thackeray still used it in
Vanity
> > Fair, for instance) and now some reactionaries and misogynists are
fighting
> > to maintain that somewhat latter day rule. But I'm pretty certain their
> > numbers will dwindle, as they preach to an ever shrinking choir.
>
> I agree that this construction is grammatically acceptable in many if
> not all cases, but I still think that phrasing the sentences to avoid
> this construction is a good idea where we can do it easily.  For
> example, this is clearly a good idea:
>
>  So the database administrator can
>  decide which languages are available in which databases and can make
> -some languages available by default if he chooses.
> +some languages available by default if desired.
>
> And so is this, which just gets rid of a sentence that really isn't
needed:
>
>  Possibly, your site administrator has already created a database
> -for your use.  He should have told you what the name of your
> -database is.  In that case you can omit this step and skip ahead
> +for your use.  In that case you can omit this step and skip ahead
>  to the next section.
>
> But consider this one:
>
> -return any user name he chooses. This authentication method is
> +return any user name they choose. This authentication method is
>
> You could say "any arbitrary user name" or "any username whatsoever".

Those all seem like they might improve the combination of
specificity and non-specificity.

Almost certainly no one intended to indicate that the administrator was
specifically male or female, and removing an indicator diminishes some
potential for confusion.

I'll throw in, for good measure, that "users" are not necessarily even
*human*; it is common for users to get attached to applications, and
the applications (presumably!) haven't got any gender.

I could visit French for a moment, where all nouns are expected to
have gender.  (Where "la" indicates a "female she-ness", and "le"
indicates "masculinity.")

"La chaise est féminin, comme la table, alors que le sol est masculin."

The chair is feminine, as the table, however the floor is masculine.

And the explanation of the gendering of third person pronouns
(ils versus elles) always seemed very strange to me.

I think that using "he or she" (as has been suggested) heads down a
questionable path, as that's demanding (in somewhat the French
fashion!) a defined set of gender indicates.  That would properly head,
in a baroque "PC" context, to sillyness like...

"The user should do [something]; he or she or it, or the cis/trans/asexual
personage or connection used by a web application, whether written in
C, Java, Perl, PHP, running on Windows or Unix or ..." as the increasing
inclusions heads to some pathological limit.

> Or here:
>
>or within a session via the SET command.  Any user is
> -  allowed to change his session-local value.  Changes in
> +  allowed to change their session-local value.  Changes in
>
> You could say "This requires no special privileges".  This isn't
> really an exact rewrite of the sentence, but in context it means the
> same thing.

Notice that the changes you are suggesting tend to actually *shorten* the
text!  I like that.

> Or here:
>
> --- Who works for us when she must pay for it?
> +-- Who works for us when they must pay for it?
>
> You could say "-- We pay employees; they don't pay us."
>
> I don't think any of these changes are outright wrong except for
> "might not be the same as the database user that is to be connect as",
> which seems like a muddle.  But I think some of them could be changed
> to use other wording that would read more smoothly.
>
> Of course, that is just my opinion, and I clearly feel a lot less
> strongly about this than some other people.

I'd be pleased to see (perhaps even help) patches to the documentation
that make it read better and perhaps "more kindly."

Doing a "let's run through and substitute some legalistic wording in
order to be politically correct" will irritate people; instead, make the
documentation *better*.  Replacing "he" with "he/she/cis/trans/Unix/Windows"
(or some such) wouldn't make it *better*.  (And probably that's a phrase
that's missing some legalistic wherefores whereases!)

-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Geoff Winkless
On 22 September 2015 at 21:22, David Steele  wrote:

> I think conversations like this are a part of why we have trouble
> attracting new contributors (of any gender) to the community.
>

​It's very clear that my use of the word (which I shan't make the mistake
of repeating!) is not acceptable to many on this list. I ​apologise
unreservedly to anyone I have unintentionally offended by the use of this
word, I shall not do so again; I would only say in my defence that I was
not intending to characterise all feminists in that way, rather to
distinguish the sort of feminists to whom the fact that a text refers to
"man"​ or "he" is offensive from the sort of feminists who I admire and
respect - those who fight for real equality, who use intelligent argument
and who have affected real change for good in my lifetime.

I can clearly see that my use of language has, ironically enough,
invalidated my argument somewhat. I have said my last on the subject​.

Geoff


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Adam Brightwell
> I think conversations like this are a part of why we have trouble attracting
> new contributors (of any gender) to the community.

+1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Gavin Flower

On 23/09/15 08:17, Christopher Browne wrote:
[...]


"The user should do [something]; he or she or it, or the cis/trans/asexual
personage or connection used by a web application, whether written in
C, Java, Perl, PHP, running on Windows or Unix or ..." as the increasing
inclusions heads to some pathological limit.

[...]

order to be politically correct" will irritate people; instead, make the
documentation *better*.  Replacing "he" with 
"he/she/cis/trans/Unix/Windows"

(or some such) wouldn't make it *better*.  (And probably that's a phrase
that's missing some legalistic wherefores whereases!)

[...]

You also have to include "they" as some people have multiple 
personalities, I actually met one (but only - as far as I can tell - saw 
one of them) - in the early 1990's I conversed with several people on 
alt.sexual.abuse.recovery, so got more insights into these types of 
complexities than most people.  I was doing a project in network 
traffic, and got to look at some high volume usenet groups, of which 
that group was one.


Don't forget GNU/Linux,  & GNU/Hurd, plus many others...  :-)

I'll settle for avoiding unnecessary use of gender!


Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Calculage avg. width when operator = is missing

2015-09-22 Thread Alvaro Herrera
Tom Lane wrote:

> Should we consider this HEAD-only, or a back-patchable bug fix?
> Or perhaps compromise on HEAD + 9.5?

It looks like a bug to me, but I think it might destabilize approved
execution plans(*), so it may not be such a great idea to back patch
branches that are already released.  I think HEAD + 9.5 is good.

(*) I hear there are even applications where queries and their approved
execution plans are kept in a manifest, and plans that deviate from that
raise all kinds of alarms.  I have never seen such a thing ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] One question about security label command

2015-09-22 Thread Joe Conway
On 09/15/2015 11:36 AM, Joe Conway wrote:
> On 09/13/2015 10:29 AM, Kouhei Kaigai wrote:
>> The attached one is the regression test fixup in v9.2.
>> As we applied to the v9.3 or later, it replaces unconfined_t domain
>> by the self defined sepgsql_regtest_superuser_t.

> Thanks -- I'll look through this over the next day or two.

Took a bit longer than I thought but now pushed :-)

At this point all current branches should be passing the sepgsql
regression test. As a reminder, that would be:

  rhino: 9.2 through master
  agama: 9.1

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Calculage avg. width when operator = is missing

2015-09-22 Thread Alvaro Herrera
Shulgin, Oleksandr wrote:
> On Sep 22, 2015 8:58 PM, "Andrew Dunstan"  wrote:

> > Yes, "/revenons/ à /nos moutons/." You can set up text based comparison
> > ops fairly easily for json - you just need to be aware of the limitations.
> > See https://gist.github.com/adunstan/32ad224d7499d2603708
> 
> Yes, I've already tried this approach and have found that analyze
> performance degrades an order of magnitude due to sql-level function
> overhead and casts to text.  In my tests, from 200ms to 2000ms with btree
> ops on a default sample of 30,000 rows.

You should be able to create a C function json_cmp() that simply calls
bttextcmp() internally, and C functions for each operator using that
one, in the same way.

In any case I think your patch is a good starting point.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Calculage avg. width when operator = is missing

2015-09-22 Thread Tom Lane
Alvaro Herrera  writes:
> In any case I think your patch is a good starting point.

The comments seemed to need some wordsmithing, but I think this is
probably basically a good idea; we've had similar complaints before
about some other equality-less datatypes, such as point.

Should we consider this HEAD-only, or a back-patchable bug fix?
Or perhaps compromise on HEAD + 9.5?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Jim Nasby
A client was getting some hard to diagnose out of memory errors. What 
made this especially confusing was that there was no context reported at 
all, other than the (enormous) statement that triggered the error.


At first I thought the lack of context indicated a palloc had failed 
during ereport() (since we apparently just toss the previous error when 
that happens), but it turns out there's some error reporting in 
pg_stat_statements that's less than ideal. Attached patch fixes, though 
I'm not sure if %lld is portable or not.


I'll also argue that this is a bug and should be backpatched, but I'm 
not hell-bent on that.


At the same time I looked for other messages that don't explicitly 
reference pg_stat_statements; the only others are in 
pg_stat_statements_internal() complaining about being called in an 
inappropriate function context. Presumably at that point there's a 
reasonable error context stack so I didn't bother with them.


This still seems a bit fragile to me though. Anyone working in here has 
to notice that most every errmsg mentions pg_stat_statements and decide 
there's a good reason for that. ISTM it'd be better to push a new 
ErrorContextCallback onto the stack any time we enter the module. If 
folks think that's a good idea I'll pursue it as a separate patch.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 59b8a2e..06a0912 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1893,14 +1893,24 @@ qtext_load_file(Size *buffer_size)
 
/* Allocate buffer; beware that off_t might be wider than size_t */
if (stat.st_size <= MaxAllocSize)
-   buf = (char *) malloc(stat.st_size);
-   else
-   buf = NULL;
-   if (buf == NULL)
{
+   buf = (char *) malloc(stat.st_size);
+
+   if (buf == NULL)
+   {
+   ereport(LOG,
+   (errcode(ERRCODE_OUT_OF_MEMORY),
+errmsg("out of memory attempting to 
pg_stat_statement file"),
+errdetail("file \"%s\": size %lld", 
PGSS_TEXT_FILE, stat.st_size)));
+   CloseTransientFile(fd);
+   return NULL;
+   }
+   } else {
ereport(LOG,
+   /* Is there a better code to use? IE: SQLSTATE 
53000, 53400 or 54000 */
(errcode(ERRCODE_OUT_OF_MEMORY),
-errmsg("out of memory")));
+errmsg("pg_stat_statement file is too large to 
process"),
+errdetail("file \"%s\": size %lld", 
PGSS_TEXT_FILE, stat.st_size)));
CloseTransientFile(fd);
return NULL;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread David Steele

On 9/22/15 11:00 AM, Andrew Dunstan wrote:

On 09/22/2015 10:29 AM, Geoff Winkless wrote:

​ That's your opinion; my opinion remains otherwise. It's not
"perfectly reasonable" to abuse the plural because some 1960s
feminazis either misunderstood or didn't like the fact that (because
of history) in English the gender-neutral singular happens to also be
the male singular.


Wow, 1960s feminazis, eh? I originally thought you were just a narrow
minded, pedantic and antiquated grammarian. Now I realize that's the
least of your troubles. Please take your misogyny elsewhere. I hear the
Rabid Puppies have openings.


I think conversations like this are a part of why we have trouble 
attracting new contributors (of any gender) to the community.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-09-22 Thread Amir Rohan

>On 07/16/15, Robert Haas wrote:
>    
>>> * Developers will immediately understand the format
>>
>>I doubt it.  I think any format that we pick will have to be carefully
>>documented.  People may know what JSON looks like in general, but they
>>will not immediately know what bells and whistles are available in
>>this context.
>>
>>> * Easy to programmatically manipulate in a range of languages
>>
>> <...> I think it will be rare to need to parse the postgresql.conf string,
>> manipulate it programatically, and then put it back.
>
>On Sun, Jul 19, 2015 at 4:16 PM, Tom Lane  wrote:
>> Josh Berkus  writes:
>>> On 07/17/2015 04:36 PM, Jim Nasby wrote:
 I'm guessing it'd be really ugly/hard to support at least this GUC being
 multi-line?
>>
>>> Mind you, multi-line GUCs would be useful otherwise, but we don't want
>>> to hinge this feature on making that work.
>>
>> Do we really want such a global reduction in friendliness to make this
>> feature easier?
>
>Maybe shoehorning this into the GUC mechanism is the wrong thing, and
>what we really need is a new config file for this.  The information
>we're proposing to store seems complex enough to justify that.

 

It seems like:

1) There's a need to support structured data in configuration for future

needs as well, it isn't specific to this feature.
2) There should/must be a better way to validate configuration then
to restarting the server in search of syntax errors.

 

Creating a whole new configuration file for just one feature *and* in a different
format seems suboptimal.  What happens when the next 20 features need structured

config data, where does that go? will there be additional JSON config files *and* perhaps

new mini-language values in .conf as development continues?  How many dedicated

configuration files is too many?


Now, about JSON (Earlier Upthread):

 
On 07/01/15, Peter Eisentraut wrote:

> On 6/26/15 2:53 PM, Josh Berkus wrote:
> > I would also suggest that if I lose this battle and
> > we decide to go with a single stringy GUC, that we at least use JSON
> > instead of defining our out, proprietary, syntax?
>  
> Does JSON have a natural syntax for a set without order?

 

No. Nor Timestamps. It doesn't even distingush integer from float

(Though parsers do it for you in dynamic languages). It's all because

of its unsightly _javascript_ roots.
 


The current patch is now forced by JSON to conflate sets and lists, so

un/ordered semantics are no longer tied to type but to the specific configuration keys.

So, If a feature ever needs a key where the difference between set and list matters

and needs to support both, you'll need seperate keys (both with lists, but meaning different things)

or a separate "mode" key or something. Not terrible, just iffy.

 


Other have found JSON unsatisfactory before. For example, the clojure community

has made (at least) two attempts at alternatives, complete with the meh adoption

rates you'd expect despite being more capable formats:

 

http://blog.cognitect.com/blog/2014/7/22/transit
https://github.com/edn-format/edn

 

There's also YAML, TOML, etc', none as universal as JSON. But to reiterate, JSON itself

has Lackluster type support (no sets, no timestamps), is verbose, iseasy to malform when editing

(missed a curly brace, shouldn't use a single quote), isn't extensible, and my personal pet peeve

is that it doesn't allow non-string or bare-string keys in maps (a.k.a "death by double-quotes").
 

Python has the very natural {1,2,3} syntax for sets, but of course that's not part of JSON.

 

If  JSON wins out despite all this, one alternative not discussed is to extend

the .conf parser to accept json dicts as a fundamental type. e.g.:

 

###

data_directory = 'ConfigDir'   
port = 5432
work_mem = 4MB
hot_standby = off
client_min_messages = notice
log_error_verbosity = default
autovacuum_analyze_scale_factor = 0.1
synch_standby_config = {
  "sync_info": {
    "nodes": [
  {
    "priority": 1,
    "group": "cluster1"
  },
  "A"
    ],
    "quorum": 3
  },
  "groups": {
    "cluster1": [
  "B",
  "C"
    ]
  }
}

 

This *will* break someone's perl I would guess. Ironically, those scripts wouldn't have broken if

some structured format were in use for the configuration data when they were written...

`postgres --describe-config` is also pretty much tied to a line-oriented configuration.

 

Amir

 

p.s.

 

MIA configuration validation tool/switch should probably get a thread too.

 




Re: [HACKERS] PATCH: use foreign keys to improve join estimates v1

2015-09-22 Thread David Rowley
On 20 August 2015 at 13:49, Tomas Vondra 
wrote:

> attached is a significantly reworked patch for using the foreign keys in
> selectivity estimation.
>

Thanks for working a new patch, I'm starting to look at it again now:

Here's my thoughts so far:

+ /*
+ * TODO Can we do something like (hasindex) here? Is it necessary? The
+ *  trouble with that is that we don't have a good place to reset that
+ *  flag (relhasindex is reset by vacuum, but is has nothing to do with
+ *  foreign keys at this point).
+ */
+ if (true)

You don't seem to have addressed this part yet.

I mentioned previously that I had attempted this already. Here was the
response
http://www.postgresql.org/message-id/18703.1410450...@sss.pgh.pa.us

Please just remove the comment and if (true).  By today's
standards relhasindex would never be added.
Does it not just serve as an optimization only when the rel is not cached
anyway?

--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -562,7 +562,6 @@ remove_rel_from_joinlist(List *joinlist, int relid, int
*nremoved)
  return result;
 }

-
 /*

Accidental change, please remove.


+static Selectivity
+clauselist_join_selectivity(PlannerInfo *root, List *joinquals, int varno,
+ JoinType jointype, SpecialJoinInfo *sjinfo)

varno is not used.

+ /*
+ * First we'll identify foreign keys that are fully matched by the join
+ * clauses, and we'll update the selectivity accordingly while doing so.
+ */
+ fkeys = find_satisfied_fkeys(root, sjinfo, joinquals, );
+
+ /*
+ * Now that we have the foreign keys, we can get rid of the clauses
+ * matching any of them, and only keep the remaining clauses, so that
+ * we can estimate them using the regular selectivity estimation.
+ */
+ unmatched = filter_fk_join_clauses(root, sjinfo, fkeys, joinquals);

This seems a bit convoluted and inefficient.
Why not just return a bitmask of the matched items indexed by the list
position?
Doing it this way you're having to match the expressions twice. Once seems
fine.
Did you read my review about looking for the longest matches by counting
the bits in the mask?

+ * Another slightly strange case is FK constraints in both directions
(these
+ * statements don't work - the foreign keys need to be established using
+ * ALTER, but for illustration it's sufficient).
+ *
+ * CREATE TABLE a (
+ * a1 INT,
+ * a2 INT,
+ * UNIQUE (a1, a2),
+ * FOREIGN KEY (a1, a2) REFERENCES a (b1, b2)
+ * );

Did you perhaps mean b? i.e. REFERENCES b (b1, b2) ?

Same is duplicated further down.

+ *sel *= (1.0 / rel_outer->tuples);

I think this needs to be :

+ *sel *= (1.0 / Max(rel_outer->tuples, 1.0));

As the following causes a divide by zero.

See attached divide_by_zero.sql

Basically causes this:  Hash Join  (cost=8.29..-1.#J rows=1 width=40)


+ * XXX This does exactly the same thing as the previous loop, so no
+ * comments.

It would be better if we could do something more how make_join_rel() does
things like:

add_paths_to_joinrel(root, joinrel, rel1, rel2,
JOIN_INNER, sjinfo,
restrictlist);
add_paths_to_joinrel(root, joinrel, rel2, rel1,
JOIN_INNER, sjinfo,
restrictlist);

Where the inputs to the function are just swapped.


+ outer = -1;
+ while ((outer = bms_next_member(sjinfo->min_lefthand, outer)) >= 0)

Why did you change this from ensuring we get a singleton member?
I'd imagine if more than 2 relations are required to make the join, then we
can't use foreign keys, as the other join may duplicate or eliminate tuples.
Perhaps you've thought of this more than I have. Would you be able to
explain?

+ * XXX Maybe we should estimate even the single-column keys here,
+ * as it's really cheap. But it can't do any cross-table check
+ * of MCV lists or whatever clauselist_selectivity() does.
+ */
+ if (fkinfo->nkeys < 2)
+ continue;

This should be removed. I see no reason at all to pass up likely perfect
estimates for histogram lookup estimates.


Overall, I really feel that you should just take the longest matching
foreign key. i.e the one that matches the most clauses in the joinquals,
and completely ignore any shorter matches, or partial matches.  Just that
alone is probably 99.% of the use cases that will benefit from this. No
use adding weird code that's only right half the time for the other 0.0001%
of use cases.

I imagine clauselist_join_selectivity should be more like:

int outerrelid, innerrelid;

if (bms_get_singleton_member(sjinfo->min_righthand, ) &&
   bms_get_singleton_member(sjinfo->min_lefthand, ))
{
   Bitmapset fkclauses1, fkclauses2;
   List *unmatched;
   Selectivity sel;
   RelOptInfo *innerrel = find_base_rel(root, innerrelid);
   RelOptInfo *outerrel = find_base_rel(root, outerrelid);

   fkclauses1 = find_foreign_key_clauses(root, outerrel, innerrel,
joinquals);
   fkclauses2 = find_foreign_key_clauses(root, innerrel, outerrel,
joinquals);

  if (fkclauses1 || fkclauses2)
  {
 if 

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread David Rowley
On 23 September 2015 at 10:16, Jim Nasby  wrote:

>
>
Attached patch fixes, though I'm not sure if %lld is portable or not.
>
>
I think you could probably use INT64_FORMAT, and cast the stat.st_size to
int64 too.

There's an example in FileRead() in fd.c

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 7:45 PM, Andres Freund  wrote:
> On 2015-09-23 01:24:31 +0200, Andres Freund wrote:
>> I think we put at least three layers on bandaid on this issue since
>> 9.3.2, and each layer made things more complicated.
>
> 2a9b01928f193f529b885ac577051c4fd00bd427 - Cope with possible failure of the 
> oldest MultiXact to exist.
> 5bbac7ec1b5754043e073a45454e4c257512ce30 - Advance the stop point for 
> multixact offset creation only at checkpoint.
> 9a28c3752c89ec01fb8b28bb5904c6d547507fda - Have multixact be truncated by 
> checkpoint, not vacuum
> 215ac4ad6589e0f6a31cc4cd867aedba3cd42924 - Truncate pg_multixact/'s contents 
> during crash recovery
>
> At least these are closely related to the fact that truncation isn't WAL
> logged. There are more that are tangentially related. We (primarily me,
> writing the timewise first one) should have gone for a new WAL record
> from the start. We've discussed that in at least three of the threads
> around the above commits...

I'm not disagreeing with any of that.  I'm just disagreeing with you
on the likelihood that we're going to make things better vs. making
them worse.  But, really, I've said everything I have to say about
this.  You have a commit bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Andres Freund
On 2015-09-22 20:14:11 -0400, Robert Haas wrote:
> I'm not disagreeing with any of that.  I'm just disagreeing with you
> on the likelihood that we're going to make things better vs. making
> them worse.  But, really, I've said everything I have to say about
> this.  You have a commit bit.

I'm not going to push backpatch this to 9.3/9.4 without you being on
board. For that I think you're unfortunately too often right, and this
is too critical. But I'm also not going to develop an alternative
stopgap for those versions, since I have no clue how that'd end up being
better.

The only alternative proposal I have right now is to push this to
9.5/9.6 (squashed with a followup patch removing legacy truncations) and
then push the patch including legacy stuff to 9.3/4 after the next set
of releases.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Tom Lane
David Rowley  writes:
> On 23 September 2015 at 10:16, Jim Nasby  wrote:
>>> Attached patch fixes, though I'm not sure if %lld is portable or not.

It is not.

> I think you could probably use INT64_FORMAT,

Not in a message you expect to be translatable.

There are ways around that, but TBH I do not think that including the file
size in the errdetail is valuable enough to be worth the trouble.  I'd
just leave it out.  "insufficient memory to load statement file" seems
quite enough.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Andres Freund
On 2015-09-23 01:24:31 +0200, Andres Freund wrote:
> I think we put at least three layers on bandaid on this issue since
> 9.3.2, and each layer made things more complicated.

2a9b01928f193f529b885ac577051c4fd00bd427 - Cope with possible failure of the 
oldest MultiXact to exist.
5bbac7ec1b5754043e073a45454e4c257512ce30 - Advance the stop point for multixact 
offset creation only at checkpoint.
9a28c3752c89ec01fb8b28bb5904c6d547507fda - Have multixact be truncated by 
checkpoint, not vacuum
215ac4ad6589e0f6a31cc4cd867aedba3cd42924 - Truncate pg_multixact/'s contents 
during crash recovery

At least these are closely related to the fact that truncation isn't WAL
logged. There are more that are tangentially related. We (primarily me,
writing the timewise first one) should have gone for a new WAL record
from the start. We've discussed that in at least three of the threads
around the above commits...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Peter Geoghegan
On Tue, Sep 22, 2015 at 3:16 PM, Jim Nasby  wrote:
> At first I thought the lack of context indicated a palloc had failed during
> ereport() (since we apparently just toss the previous error when that
> happens), but it turns out there's some error reporting in
> pg_stat_statements that's less than ideal. Attached patch fixes, though I'm
> not sure if %lld is portable or not.

+ ereport(LOG,
+  (errcode(ERRCODE_OUT_OF_MEMORY),
+   errmsg("out of memory attempting to pg_stat_statement file"),
+   errdetail("file \"%s\": size %lld", PGSS_TEXT_FILE,
stat.st_size)));

Uh, what?

I'm not opposed to this basic idea, but I think the message should be
reworded, and that the presence of two separate ereport() call sites
like the above is totally unnecessary. The existing MaxAllocSize check
is just defensive; no user-visible distinction needs to be made.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Andres Freund
On 2015-09-22 14:52:49 -0400, Robert Haas wrote:
> 1. It would be possible to write a patch that included ONLY the
> changes needed to make that happen, and did nothing else.  It would be
> largely a subset of this.  If we want to change 9.3 and 9.4, I
> recommend we do that first, and then come back to the rest of this.

I think that patch would be pretty much what I wrote.

To be correct you basically have to:
1) Never skip a truncation on the standby. Otherwise there might have
   already have been wraparound and you read the completely wrong
   offset.
2) Always perform truncations on the standby exactly the same moment (in
   the replay stream) as on the primary. Otherwise there also can be a
   wraparound.
3) Never read anything from an SLRU from the data directory while
   inconsistent. In an inconsistent state we can read completely wrong
   data. A standby can be inconsistent in many situations, including
   crashes, restarts and fresh base backups.

To me these three together leave only the option to never read an SLRUs
contents on a standby.  That only leaves minor changes in the patch that
could be removed afaics. I mean we could leave in
DetermineSafeOldestOffset() but it'd be doing pretty much the same as
SetOffsetVacuumLimit().


I think we put at least three layers on bandaid on this issue since
9.3.2, and each layer made things more complicated. We primarily did so
because of the compatibility and complexity concerns. I think that was a
bad mistake. We should have done it mostly right back then, and we'd be
better of now. If we continue with bandaids on the back branches while
having a fixed 9.5+ with significantly different behaviour we'll have a
hellish time fixing things in the back branches. And introduce more bugs
than this might introduce.


> 2. I agree that what we're doing right now is wrong.  And I agree that
> this fixes a real problem. But it seems to me to be quite possible,
> even likely, that it will create other problems.

Possible. But I think those bugs will be just bugs and not more
fundamental architectural problems.

To be very clear. I'm scared of the idea of backpatching this. I'm more
scared of doing that myself. But even more I am scared of the current
state.


> For example, suppose that there are files in the data directory that
> precede oldestMultiXact. In the current approach, we'll remove those
> because they're not in the range we expect to be used.

Hm. For offsets/ we continue to use SimpleLruTruncate() for truncation,
which scans the directory, so I don't see a problem. For members/ we
won't - but neither do we really today, see
SlruScanDirCbRemoveMembers(). So I don't think there'll be a significant
difference?


> a leftover old file that doesn't get removed the first time through -
> for whatever reason - becomes a time bomb that will explode on the
> next wraparound.  I don't know that that will happen.

We should be able to deal with that, otherwise recovery is pretty
borked. It can be a problem for the 'recovery from wrong oldest multi'
case, but that's the same today.


> I will bet you a beer that there are other possible hazards neither of
> us is foreseeing right now.

Right. I'm not dismissing that. I just think it's much more likely to be
handleable problems than the set we have today. It's incredibly hard to
get an accurate mental model of the combined behaviour & state of
primary and standby today. Even if we three have that today, I'm pretty
sure we won't in half a year. And sure as hell nearly nobody else will
have one.


> >> - If SlruDeleteSegment fails in unlink(), shouldn't we at the very
> >> least log a message?  If that file is still there when we loop back
> >> around, it's going to cause a failure, I think.
> >
> > The existing unlink() call doesn't, that's the only reason I didn't add
> > a message there. I'm fine with adding a (LOG or WARNING?) message.
> 
> Cool.

Hm. When redoing a truncation during [crash] recovery that can cause a
host of spurious warnings if already done before. DEBUG1 to avoid
scaring users?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 3:21 PM, Amit Kapila  wrote:
> Attached patch (read_funcs_v1.patch) contains support for all the plan
> and other nodes (like SubPlan which could be required for worker) except
> CustomScan node.

It looks like you need to update the top-of-file comment for outfuncs.c.

Doesn't _readCommonPlan() leak?  I think we should avoid that.
_readCommonScan() and _readJoin() are worse: they leak multiple
objects.  It should be simple enough to avoid this: just have your
helper function take a Plan * as argument and then use
READ_TEMP_LOCALS() rather than READ_LOCALS().  Then the caller can use
READ_LOCALS, call the helper to fill in all the Plan fields, and then
read the other stuff itself.

Instead of passing the Plan down by casting, how about passing
_node->plan?  And similarly for scans and joins.

readAttrNumberCols uses sizeof(Oid) instead of sizeof(AttrNumber).

I still don't understand why we need to handle PlanInvalItem.
Actually, come to think of it, I'm not sure we need PlannedStmt
either.  Let's leave those out; they seem like trouble.

I think it would be worth doing something like this:

#define READ_ATTRNUMBER_ARRAY(fldname, len) \
token = pg_strtok(); \
local_node->fldname = readAttrNumberCols(len);

And similarly for READ_OID_ARRAY, READ_BOOL_ARRAY, READ_INT_ARRAY.

In general these routines are in the same order as plannodes.h, which
is good. But _readNestLoopParam is out of place.  Can we move it just
after _readNestLoop?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Jim Nasby

On 9/22/15 5:58 PM, Peter Geoghegan wrote:

On Tue, Sep 22, 2015 at 3:16 PM, Jim Nasby  wrote:

At first I thought the lack of context indicated a palloc had failed during
ereport() (since we apparently just toss the previous error when that
happens), but it turns out there's some error reporting in
pg_stat_statements that's less than ideal. Attached patch fixes, though I'm
not sure if %lld is portable or not.


+ ereport(LOG,
+  (errcode(ERRCODE_OUT_OF_MEMORY),
+   errmsg("out of memory attempting to pg_stat_statement file"),
+   errdetail("file \"%s\": size %lld", PGSS_TEXT_FILE,
stat.st_size)));

Uh, what?


Oops. I'll fix that and address David's concern tomorrow.


I'm not opposed to this basic idea, but I think the message should be
reworded, and that the presence of two separate ereport() call sites
like the above is totally unnecessary. The existing MaxAllocSize check
is just defensive; no user-visible distinction needs to be made.


I disagree. If you're running this on a 200+GB machine with plenty of 
free memory and get that error you're going to be scratching your head. 
I can see an argument for using the OOM SQLSTATE, but treating an 
artificial limit the same as a system error seems pretty bogus.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Tom Lane
Peter Geoghegan  writes:
> I'm not opposed to this basic idea, but I think the message should be
> reworded, and that the presence of two separate ereport() call sites
> like the above is totally unnecessary. The existing MaxAllocSize check
> is just defensive; no user-visible distinction needs to be made.

I wonder whether the real problem here is failure to truncate statement
texts to something sane.  Do we really need to record the whole text of
multi-megabyte statements?  Especially if doing so could render the entire
feature nonfunctional?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-22 Thread Peter Geoghegan
On Tue, Sep 22, 2015 at 4:40 PM, Tom Lane  wrote:
> I wonder whether the real problem here is failure to truncate statement
> texts to something sane.  Do we really need to record the whole text of
> multi-megabyte statements?  Especially if doing so could render the entire
> feature nonfunctional?

I recently encountered a 9.4 customer database that had an insanely
large query text stored by pg_stat_statements, apparently created as
part of a process of kicking the tires of their new installation. I
don't know how large it actually was, but it caused psql to stall for
over 10 seconds. Insane queries happen, so truncating query text could
conceal the extent of how unreasonable a query is.

I think that the real problem here is that garbage collection needs to
deal with OOM more appropriately. That's the only way there could be a
problem with an in-flight query as opposed to a query that looks at
pg_stat_statements, which seems to be Nasby's complaint.

My guess is that this very large query involved a very large number of
constants, possibly contained inside an " IN ( )". Slight variants of
the same query, that a human would probably consider to be equivalent
have caused artificial pressure on garbage collection.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 11:00 AM, Andrew Dunstan  wrote:
>> Happily for me, I can continue to write documents in a grammatically
>> correct way, and no-one will read them and think I'm a grammar-nazi (or
>> obstinate, or old-fashioned or whatever) because unless they're specifically
>> looking for it no-one will notice that I'm avoiding the contentious usage
>> altogether. On the other hand, there _will_ be a (perhaps significant)
>> proportion of people who read your documents and think that you're incapable
>> of writing a grammatically correct sentence.
>
> Wow, 1960s feminazis, eh? I originally thought you were just a narrow
> minded, pedantic and antiquated grammarian. Now I realize that's the least
> of your troubles. Please take your misogyny elsewhere. I hear the Rabid
> Puppies have openings.

I think this discussion could benefit from a little more light and a
lot less heat.

Here's my contribution: the use of they does sometimes seek awkward.
However, it's not remotely new:

https://stroppyeditor.wordpress.com/2015/04/21/everything-you-ever-wanted-to-know-about-singular-they/
https://en.wikipedia.org/wiki/Singular_they#Older_usage_by_respected_authors
http://englishbibles.blogspot.com/2006/09/singular-they-in-english-bibles.html

And I do think it's generally worthwhile to avoid the use of "he"
where possible.  Would I have done it exactly the way that Peter did
it here?  Uh, no.  Is it better than not doing anything?  In my
opinion, yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving test coverage of extensions with pg_dump

2015-09-22 Thread Alvaro Herrera
Euler Taveira wrote:
> On 17-09-2015 14:21, Michael Paquier wrote:
> >pg_dump relies on attnum to define the column ordering, so one
> >possibility would be to do things more consistently at backend level.

We discussed this in some other thread, not long ago.  I looked briefly
in the archives but couldn't find it.  I think the conclusion was
something along the lines of "hmm, tough".

> Someone can say that we could assign an attnum for column "d" considering
> all of the inheritance tree. However, attnum is used as an index to arrays
> (we could bloat some of those) and some logic rely on it to count the number
> of columns. It would become tablecmds.c into an spaghetti.

We don't need any more spaghetti there, thanks!

> IMHO a possible way to solve it is adding support for logical column
> ordering. An ALTER TABLE command (emitted if a parameter was informed)
> during dump could handle it. BTW, last thread [1] about logical column
> ordering seems to have died a few months ago. Alvaro?

Tomas Vondra also worked a bit on this patch, and we eventually gave up
on it due to lack of time.  We might be able to get back on it someday,
but do not hold your breath.  If you want the current bug fixed, do not
wait for logical column numbering.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Calculage avg. width when operator = is missing

2015-09-22 Thread Shulgin, Oleksandr
Hi Hackers,

I've recently stumbled upon a problem with table bloat estimation in case
there are columns of type JSON.

The quick bloat estimation queries use sum over pg_statistic.stawidth of
table's columns, but in case of JSON the corresponding entry is never
created by the ANALYZE command due to equality comparison operator
missing.  I understand why there is no such operator defined for this
particular type, but shouldn't we still try to produce meaningful average
width estimation?

In my case the actual bloat is around 40% as verified with pgstattuple,
while the bloat reported by quick estimate can be between 75% and 95%(!) in
three instances of this problem.  We're talking about some hundreds of GB
of miscalculation.

Attached patch against master makes the std_typanalyze still try to compute
the minimal stats even if there is no "=" operator.  Makes sense?

I could also find this report in archives that talks about similar problem,
but due to all values being over the analyze threshold:

http://www.postgresql.org/message-id/flat/12480.1389370...@sss.pgh.pa.us#12480.1389370...@sss.pgh.pa.us

I think we could try harder, otherwise any estimate relying on average
width can be way off in such cases.

--
Alex
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
new file mode 100644
index 861048f..903681e
*** a/src/backend/commands/analyze.c
--- b/src/backend/commands/analyze.c
*** std_typanalyze(VacAttrStats *stats)
*** 1723,1732 
  			 , , NULL,
  			 NULL);
  
- 	/* If column has no "=" operator, we can't do much of anything */
- 	if (!OidIsValid(eqopr))
- 		return false;
- 
  	/* Save the operator info for compute_stats routines */
  	mystats = (StdAnalyzeData *) palloc(sizeof(StdAnalyzeData));
  	mystats->eqopr = eqopr;
--- 1723,1728 
*** std_typanalyze(VacAttrStats *stats)
*** 1737,1743 
  	/*
  	 * Determine which standard statistics algorithm to use
  	 */
! 	if (OidIsValid(ltopr))
  	{
  		/* Seems to be a scalar datatype */
  		stats->compute_stats = compute_scalar_stats;
--- 1733,1739 
  	/*
  	 * Determine which standard statistics algorithm to use
  	 */
! 	if (OidIsValid(eqopr) && OidIsValid(ltopr))
  	{
  		/* Seems to be a scalar datatype */
  		stats->compute_stats = compute_scalar_stats;
*** std_typanalyze(VacAttrStats *stats)
*** 1776,1786 
  /*
   *	compute_minimal_stats() -- compute minimal column statistics
   *
!  *	We use this when we can find only an "=" operator for the datatype.
   *
   *	We determine the fraction of non-null rows, the average width, the
   *	most common values, and the (estimated) number of distinct values.
   *
   *	The most common values are determined by brute force: we keep a list
   *	of previously seen values, ordered by number of times seen, as we scan
   *	the samples.  A newly seen value is inserted just after the last
--- 1772,1784 
  /*
   *	compute_minimal_stats() -- compute minimal column statistics
   *
!  *	We use this when we cannot find "=" and "<" operators for the datatype.
   *
   *	We determine the fraction of non-null rows, the average width, the
   *	most common values, and the (estimated) number of distinct values.
   *
+  *	Only if there is a "=" operator, we try to find the most common values.
+  *
   *	The most common values are determined by brute force: we keep a list
   *	of previously seen values, ordered by number of times seen, as we scan
   *	the samples.  A newly seen value is inserted just after the last
*** compute_minimal_stats(VacAttrStatsP stat
*** 1809,1830 
  		Datum		value;
  		int			count;
  	} TrackItem;
! 	TrackItem  *track;
! 	int			track_cnt,
! track_max;
  	int			num_mcv = stats->attr->attstattarget;
  	StdAnalyzeData *mystats = (StdAnalyzeData *) stats->extra_data;
  
! 	/*
! 	 * We track up to 2*n values for an n-element MCV list; but at least 10
! 	 */
! 	track_max = 2 * num_mcv;
! 	if (track_max < 10)
! 		track_max = 10;
! 	track = (TrackItem *) palloc(track_max * sizeof(TrackItem));
! 	track_cnt = 0;
  
! 	fmgr_info(mystats->eqfunc, _cmpeq);
  
  	for (i = 0; i < samplerows; i++)
  	{
--- 1807,1830 
  		Datum		value;
  		int			count;
  	} TrackItem;
! 	TrackItem  *track = NULL;
! 	int			track_cnt = 0,
! track_max = 0;
  	int			num_mcv = stats->attr->attstattarget;
  	StdAnalyzeData *mystats = (StdAnalyzeData *) stats->extra_data;
  
! 	if (OidIsValid(mystats->eqfunc))
! 	{
! 		/*
! 		 * We track up to 2*n values for an n-element MCV list; but at least 10
! 		 */
! 		track_max = 2 * num_mcv;
! 		if (track_max < 10)
! 			track_max = 10;
! 		track = (TrackItem *) palloc(track_max * sizeof(TrackItem));
  
! 		fmgr_info(mystats->eqfunc, _cmpeq);
! 	}
  
  	for (i = 0; i < samplerows; i++)
  	{
*** compute_minimal_stats(VacAttrStatsP stat
*** 1876,1926 
  			total_width += strlen(DatumGetCString(value)) + 1;
  		}
  
! 		/*
! 		 * See if the value matches anything we're already 

Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Geoff Winkless
On 22 September 2015 at 09:28, Albe Laurenz  wrote:

> Peter Geoghegan wrote:
> > On Mon, Sep 21, 2015 at 9:32 PM, Erik Rijkers  wrote:
> >> I think this compulsive 'he'-avoiding is making the text worse.
> >>
> >>
> >> -  environment variable); any user can make such a change for his
> session.
> >> +  environment variable); any user can make such a change for their
> session.
> >
> > -1. It seems fine to me.
>
> (Disclaimer: I am not a native speaker.)
>
> Using the pronoun of the third person plural as a replacement for "his or
> her"
> has become widely used, at least in the U.S., and the OED condones that
> use:
> http://www.oxforddictionaries.com/definition/english/they
>
>
Without wanting to get into a grammar war, ​I'm not so sure I agree that it
"condones" it. Dictionaries reflect the current state of usage, they don't
act as arbiters of correctness. The abuse of "literally" as an emphasiser
(which usage is now listed in the OED) is a prime example.

As an Englishman ​I would prefer "his or her" over "their". Perhaps our
American cousins might disagree though.

WRT the second, it probably doesn't help that "might not be the same as the
database user that is to be connect as" is incorrect anyway - it should
perhaps be "that is to be connect*ed *as" (although I still find the
construction clumsy).

Geoff


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2015-09-22 Thread Ildus Kurbangaliev
On Tue, 15 Sep 2015 14:39:51 -0400
Robert Haas  wrote:

> On Tue, Sep 15, 2015 at 12:44 PM, Ildus Kurbangaliev
>  wrote:
> > On Mon, 14 Sep 2015 06:32:22 -0400
> > Robert Haas  wrote:
> >
> >> On Sun, Sep 13, 2015 at 5:05 PM, Ildus Kurbangaliev
> >>  wrote:
> >> > Yes, that is because I tried to go with current convention
> >> > working with shmem in Postgres (there are one function that
> >> > returns the size and others that initialize that memory). But I
> >> > like your suggestion about API functions, in that case number of
> >> > tranches and locks will be known during the initialization.
> >>
> >> I also want to leave the door open to tranches that are registered
> >> after initialization.  At that point, it's too late to put a
> >> tranche in shared memory, but you can still use DSM.
> >
> > We can hold some extra space in LWLockTrancheArray, add some
> > function for unregistering a tranche, and reuse free items in
> > LWLockTrancheId later.
> 
> We could, but since that would be strictly more annoying and less
> flexible than what we've already got, why would we?
> 

Yes, probably.
I'm going to change API calls as you suggested earlier.
How you do think the tranches registration after initialization should
look like?


Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Albe Laurenz
Peter Geoghegan wrote:
> On Mon, Sep 21, 2015 at 9:32 PM, Erik Rijkers  wrote:
>> I think this compulsive 'he'-avoiding is making the text worse.
>>
>>
>> -  environment variable); any user can make such a change for his 
>> session.
>> +  environment variable); any user can make such a change for their 
>> session.
> 
> -1. It seems fine to me.

(Disclaimer: I am not a native speaker.)

Using the pronoun of the third person plural as a replacement for "his or her"
has become widely used, at least in the U.S., and the OED condones that use:
http://www.oxforddictionaries.com/definition/english/they

Do we want to have that everywhere?

Yours,
Laurenz Albe 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LW_SHARED_MASK macro

2015-09-22 Thread Andres Freund
On 2015-09-21 21:36:15 +0200, Andres Freund wrote:
> On 2015-09-21 22:34:46 +0300, Alexander Korotkov wrote:
> > Great. BTW, are you going to commit this?
> 
> Yes, will do so tomorrow.

And done. Thanks for noticing.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Geoff Winkless
On 22 September 2015 at 14:09, Andrew Dunstan  wrote:

> You are fighting a losing battle. Think of they/them/their/theirs as being
> indefinitely gendered third person singular pronouns, as well as being
> third person plural pronouns. Yes it's a relatively new usage, but I don't
> think its at all unreasonable (speaking as someone who has been known to
> dislike some new usages and neologisms). It's not at all sloppy. On the
> contrary, it's quite deliberate. It's just not quite traditional.
>

​​It _is_ sloppy. It says "I can't be bothered to write a sentence that's
grammatically correct".
​

> You need to get over that.
>

I don't need to get over anything. If someone sends me a document that uses
"their" in a singular usage, I will think that person is lazy. That will
continue to be the case, whether people tell me that it's accepted usage or
not.

In much the same way, I know that I can safely discount the opinion of
anyone who uses "literally" to mean anything other than "literally"
 (
​similarly anyone who uses
"like" as a quotative)
​, even though both of those things are now in fairly common usage.​
​


> Your proposed style would make writing docs a lot harder,
>

I don't buy that at all. It takes a couple of seconds, if that, to come up
with something.


> forcing us to avoid use of the singular in cases where it is quite
> natural.
>

​Better than using the plural in the singular case.
​

> I'm strongly opposed to such a style rule.
>

​Meh. I don't really care how it's written, certainly not enough to make a
stand about it. I'd rather you guys concentrate on writing the brilliant
software than wasting time on stuff like this. I only replied because the
conversation popped up in my inbox and it seemed to be something on which
opinions were requested.

Geoff


[HACKERS] 9.5: Can't connect with PGSSLMODE=require on Windows

2015-09-22 Thread Thom Brown
Hi,

With 9.5 alpha 2 on Windows 8 (64-bit), trying to require SSL results
in a blocking error:

pg_hba.conf:
hostssl   postgres   postgres   0.0.0.0/0   trust

postgresql.conf:
ssl=on


C:\Program Files\PostgreSQL\9.5\bin>SET PGSSLMODE=require

C:\Program Files\PostgreSQL\9.5\bin>psql.exe -p 5432 -U postgres -h
localhost postgres
psql: SSL SYSCALL error: Operation would block (0x2733/10035)



Doing the same thing on Linux doesn't result in the error (I'm using
latest Git master here):

thom@swift:~/Development/data$ PGSSLMODE=require psql -U thom -p 5488
-d postgres -h 127.0.0.1
psql (9.6devel)
SSL connection (protocol: TLSv1.2, cipher:
ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

postgres=#

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Obsolete use of volatile in walsender.c, walreceiver.c, walreceiverfuncs.c?

2015-09-22 Thread Andres Freund
Hi,

On 2015-09-17 16:32:17 +1200, Thomas Munro wrote:
> In walsender.c, walreceiver.c, walreceiverfuncs.c there are several
> places where volatile qualifiers are used apparently only to prevent
> reordering around spinlock operations.  My understanding is that if
> potential load/store reordering around spinlock operations is the only
> reason for using volatile, 0709b7ee72e4bc71ad07b7120acd117265ab51d0
> made it unnecessary.  For example see
> 6ba4ecbf477e0b25dd7bde1b0c4e07fc2da19348 which stripped some volatile
> qualifiers out of xlog.c.

Same in bufmgr.c et al. There it's actually rather annoying for new code
because volatile needs to be casted away in a bunch of places...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Gavin Flower

On 22/09/15 21:33, Geoff Winkless wrote:
On 22 September 2015 at 09:28, Albe Laurenz >wrote:


Peter Geoghegan wrote:
> On Mon, Sep 21, 2015 at 9:32 PM, Erik Rijkers > wrote:
>> I think this compulsive 'he'-avoiding is making the text worse.
>>
>>
>> -  environment variable); any user can make such a change
for his session.
>> +  environment variable); any user can make such a change
for their session.
>
> -1. It seems fine to me.

(Disclaimer: I am not a native speaker.)

Using the pronoun of the third person plural as a replacement for
"his or her"
has become widely used, at least in the U.S., and the OED condones
that use:
http://www.oxforddictionaries.com/definition/english/they


Without wanting to get into a grammar war, ​I'm not so sure I agree 
that it "condones" it. Dictionaries reflect the current state of 
usage, they don't act as arbiters of correctness. The abuse of 
"literally" as an emphasiser (which usage is now listed in the OED) is 
a prime example.


As an Englishman ​I would prefer "his or her" over "their". Perhaps 
our American cousins might disagree though.


WRT the second, it probably doesn't help that "might not be the same 
as the database user that is to be connect as" is incorrect anyway - 
it should perhaps be "that is to be connect*ed *as" (although I still 
find the construction clumsy).


Geoff

I am an Englishman.

I prefer "their" rather than "his or her", it is less clumsy & there is 
no point in specifying gender unless it is relevant!


Besides, some people are neither, or their biological gender is 
ambiguous - so a few people fit into neither the male nor the female 
category (depending on precise definitions, about 0.5%)!



Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Bill Moran
On Tue, 22 Sep 2015 11:17:54 +0100
Geoff Winkless  wrote:

> On 22 September 2015 at 10:52, Gavin Flower 
> wrote:
> 
> > On 22/09/15 21:33, Geoff Winkless wrote:
> >> ??
> >> Without wanting to get into a grammar war, ?I'm not so sure I agree that
> >> it "condones" it. Dictionaries reflect the current state of usage, they
> >> don't act as arbiters of correctness. The abuse of "literally" as an
> >> emphasiser (which usage is now listed in the OED) is a prime example.
> >>
> >> I would prefer "his or her" over "their". Perhaps our American cousins
> >> might disagree though.
> >>
> >> I prefer "their" rather than "his or her", it is less clumsy & there is
> > no point in specifying gender unless it is relevant!
> 
> I agree in that I prefer "their" in informal speech; however in a formal
> document I would find it sloppy.? I don't think "his or her" is inherently
> clumsy; m
> aybe I'm just showing my age.?
> 
> Besides, some people are neither, or their biological gender is ambiguous -
> > so a few people fit into neither the male nor the female category
> > (depending on precise definitions, about 0.5%)!
> 
> My understanding is that most intersex (and certainly all trans) people
> would identify with one or the other, and even those who don't select
> exclusively identify with a mix of both (and would therefore still be
> covered by "his or her", no?) although I don't pretend to be an expert.
> 
> Perhaps it would be easier to avoid the controversy by actually rewording
> into the plural, where possible?
> 
> So
> 
> "any user can make such a change for his session."
> 
> becomes
> 
> "Users can make such a change for their individual sessions"
> 
> or similar?

+1

As an American/native English speaker, I find the use of the plural pronoun
in combination with a singular noun (which is not grammatically correct)
in formal writing to be sloppy and jarring to read. The change you suggest
above reads more professionally while still avoiding offending anyone.

-- 
Bill Moran


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Geoff Winkless
On 22 September 2015 at 10:52, Gavin Flower 
wrote:

> On 22/09/15 21:33, Geoff Winkless wrote:
>
>> ​​
>> Without wanting to get into a grammar war, ​I'm not so sure I agree that
>> it "condones" it. Dictionaries reflect the current state of usage, they
>> don't act as arbiters of correctness. The abuse of "literally" as an
>> emphasiser (which usage is now listed in the OED) is a prime example.
>>
>> I would prefer "his or her" over "their". Perhaps our American cousins
>> might disagree though.
>>
>> I prefer "their" rather than "his or her", it is less clumsy & there is
> no point in specifying gender unless it is relevant!
>

I agree in that I prefer "their" in informal speech; however in a formal
document I would find it sloppy.​ I don't think "his or her" is inherently
clumsy; m
aybe I'm just showing my age.​

Besides, some people are neither, or their biological gender is ambiguous -
> so a few people fit into neither the male nor the female category
> (depending on precise definitions, about 0.5%)!
>

My understanding is that most intersex (and certainly all trans) people
would identify with one or the other, and even those who don't select
exclusively identify with a mix of both (and would therefore still be
covered by "his or her", no?) although I don't pretend to be an expert.

Perhaps it would be easier to avoid the controversy by actually rewording
into the plural, where possible?

So

"any user can make such a change for his session."

becomes

"Users can make such a change for their individual sessions"

or similar?

Geoff


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Wolfgang Wilhelm
Hello all,

I'm a non-native speaker and to my shame not very good in english at all. So 
just my 2c on the topic:

"Users can make such a change for their individual sessions" is for me 
perfectly understandable.
"any user can make such a change for their session" is for me a mixture of a 
singular and a plural form and difficult to understand whether this is just a 
spelling error (is it individual sessions?) or just another gap in my 
knowledge. From my point of view the documentation is not just for english 
native speakers but for all the ones which don't have a translation to their 
language and as such one shouldn't look at the newest trends in "correctness".

RegardsWolfgang



 Geoff Winkless  schrieb am 12:18 Dienstag, 
22.September 2015:
   

 On 22 September 2015 at 10:52, Gavin Flower  
wrote:

On 22/09/15 21:33, Geoff Winkless wrote:

​​Without wanting to get into a grammar war, ​I'm not so sure I agree that it 
"condones" it. Dictionaries reflect the current state of usage, they don't act 
as arbiters of correctness. The abuse of "literally" as an emphasiser (which 
usage is now listed in the OED) is a prime example.

I would prefer "his or her" over "their". Perhaps our American cousins might 
disagree though.




I prefer "their" rather than "his or her", it is less clumsy & there is no 
point in specifying gender unless it is relevant!


I agree in that I prefer "their" in informal speech; however in a formal 
document I would find it sloppy.​ I don't think "his or her" is inherently 
clumsy; maybe I'm just showing my age.​

Besides, some people are neither, or their biological gender is ambiguous - so 
a few people fit into neither the male nor the female category (depending on 
precise definitions, about 0.5%)!


My understanding is that most intersex (and certainly all trans) people would 
identify with one or the other, and even those who don't select exclusively 
identify with a mix of both (and would therefore still be covered by "his or 
her", no?) although I don't pretend to be an expert.
Perhaps it would be easier to avoid the controversy by actually rewording into 
the plural, where possible?
So
"any user can make such a change for his session."

becomes
"Users can make such a change for their individual sessions"
or similar?
Geoff

  

Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Gavin Flower

On 22/09/15 22:17, Geoff Winkless wrote:
On 22 September 2015 at 10:52, Gavin Flower 
>wrote:


On 22/09/15 21:33, Geoff Winkless wrote:

​ ​
Without wanting to get into a grammar war, ​I'm not so sure I
agree that it "condones" it. Dictionaries reflect the current
state of usage, they don't act as arbiters of correctness. The
abuse of "literally" as an emphasiser (which usage is now
listed in the OED) is a prime example.

I would prefer "his or her" over "their". Perhaps our American
cousins might disagree though.

I prefer "their" rather than "his or her", it is less clumsy &
there is no point in specifying gender unless it is relevant!


I agree in that I prefer "their" in informal speech; however in a 
formal document I would find it sloppy.​ I don't think "his or her" is 
inherently clumsy; m

aybe I'm just showing my age.​

Besides, some people are neither, or their biological gender is
ambiguous - so a few people fit into neither the male nor the
female category (depending on precise definitions, about 0.5%)!


My understanding is that most intersex (and certainly all trans) 
people would identify with one or the other, and even those who don't 
select exclusively identify with a mix of both (and would therefore 
still be covered by "his or her", no?) although I don't pretend to be 
an expert.


Perhaps it would be easier to avoid the controversy by actually 
rewording into the plural, where possible?


So

"any user can make such a change for his session."

becomes

"Users can make such a change for their individual sessions"

or similar?

Geoff
To me, the key things is NOT to specify gender, unless it is relevant - 
and I don't think gender is relevant in describing how to use a database.


I was using "Gender Appropriate" language long before the Politically 
Correct craze started (over 50 years ago)!  I was told references to 
"he" in rules included females, which I thought was daft!



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Andrew Dunstan



On 09/22/2015 10:29 AM, Geoff Winkless wrote:
Oh, good! We're actually going to have this argument? Even though I 
said I don't care what you do?


On 22 September 2015 at 15:11, Andrew Dunstan > wrote:


On 09/22/2015 09:25 AM, Geoff Winkless wrote:

 If someone sends me a document that uses "their" in a
singular usage, I will think that person is lazy. That will
continue to be the case, whether people tell me that it's
accepted usage or not.


You can think that if you like,


​Thanks!​ That's a great relief to me, as I'm sure you can imagine.

but it's not even remotely true. 



​ You just stated that the reason you don't want to use the plural 
form I suggested is because it's too hard/time-consuming. That does 
suggest you accept that it's a valid solution but you're too lazy to 
use it.​


It's a deliberate choice to use a new, perfectly reasonable and
now widely accepted style of which you disapprove, but it's not lazy.


​ That's your opinion; my opinion remains otherwise. It's not 
"perfectly reasonable" to abuse the plural because some 1960s 
feminazis either misunderstood or didn't like the fact that (because 
of history) in English the gender-neutral singular happens to also be 
the male singular.


Happily for me, I can continue to write documents in a grammatically 
correct way, and no-one will read them and think I'm a grammar-nazi 
(or obstinate, or old-fashioned or whatever) because unless they're 
specifically looking for it no-one will notice that I'm avoiding the 
contentious usage altogether. On the other hand, there _will_ be a 
(perhaps significant) proportion of people who read your documents and 
think that you're incapable of writing a grammatically correct sentence.






Wow, 1960s feminazis, eh? I originally thought you were just a narrow 
minded, pedantic and antiquated grammarian. Now I realize that's the 
least of your troubles. Please take your misogyny elsewhere. I hear the 
Rabid Puppies have openings.


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] a funnel by any other name

2015-09-22 Thread Simon Riggs
On 17 September 2015 at 05:07, Nicolas Barbier 
wrote:

> 2015-09-17 Robert Haas :
>
> > 1. Exchange Bushy
> > 2. Exchange Inter-Operator (this is what's currently implemented)
> > 3. Exchange Replicate
> > 4. Exchange Merge
> > 5. Interchange
>
> > 1. ?
> > 2. Gather
> > 3. Broadcast (sorta)
> > 4. Gather Merge
> > 5. Redistribute
>
> > 1. Parallel Child
> > 2. Parallel Gather
> > 3. Parallel Replicate
> > 4. Parallel Merge
> > 5. Parallel Redistribute
>
> FYI, SQL Server has these in its execution plans:
>
> * Distribute Streams: read from one thread, write to multiple threads
> * Repartition Streams: both read and write from/to multiple threads
> * Gather Streams: read from multiple threads, write to one thread
>

Robert, thanks for asking. We'll be stuck with these words for some time,
user visible via EXPLAIN so this is important.

In general we should stick to words already used in other similar
situations, which could include DBMS and parallel ETL tools, of which there
are many more than mentioned here.

I would be against using any of these words: Funnel, Motion, Bushy because
I don't find them very descriptive (I think of spiders, bowels and shrubs
respectively, sorry).

These words are liable to confusion with other concepts: Replicate,
Duplicate, Distribute, Partition, Repartition, MERGE.

I've seen this concept called Fan-In/Fan-Out and Scatter/Gather

The main operations are the 3 mentioned by Nicolas:
1. Send data from many to one - which has subtypes for Unsorted, Sorted and
Evenly balanced (but unsorted)
2. Send data from one process to many
3. Send data from many to many

My preferences for this would be
1. Gather (but not Gather Motion) e.g. Gather, Gather Sorted
2. Scatter (since Broadcast only makes sense in the context of a
distributed query, it sounds weird for intra-node query)
3. Redistribution - which implies the description of how we spread data
across nodes is "Distribution" (or DISTRIBUTED BY)

For 3 we should definitely use Redistribute, since this is what Teradata
has been calling it for 30 years, which is where Greenplum got it from.
For 1, Gather makes most sense.

For 2, it could be either Scatter or Distribute. The former works well with
Gather, the latter works well with Redistribute.

Sorry for my absence for further review on parallel ops.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] row_security GUC, BYPASSRLS

2015-09-22 Thread Stephen Frost
* Noah Misch (n...@leadboat.com) wrote:
> On Mon, Sep 21, 2015 at 09:30:15AM -0400, Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> > > Right now, if a BYPASSRLS user creates a SECURITY DEFINER function, any 
> > > caller
> > > can change that function's behavior by toggling the GUC.  Users won't test
> > > accordingly; better to have just one success-case behavior.
> > 
> > I agree that's not good, though the function definer could set the
> > row_security GUC on the function, no?  Similar to how we encourage
> > setting of search_path, we should be encouraging a similar approach to
> > anything which might be security relevant.
> 
> Functions can do that.  New features should not mimic search_path in their
> demands on SECURITY DEFINER authors.

I'm not convinced that having such a limitation on new features would be
a good thing.  What was missing here was a safe default.

> > Are you planning to handle the ALTER TABLE .. FORCE ROW SECURITY (#3) as
> > well?  I have no complaints if so; just want to make sure we aren't
> > doing double-work during this crunch time and didn't see your name
> > listed next to it, but the nearby thread seemed to imply you were
> > looking at it.
> 
> I'm not.

I'll work on it then, if we can get agreement as to how it will work.

> > One item which wasn't discussed, that I recall, is just how it will work
> > without SECURITY_ROW_LEVEL_DISABLED, or something similar, to
> > differentiate when internal referencial integrity queries are being run,
> > which should still bypass RLS (even in the FORCE ROW SECURITY case), and
> > when regular or SECURITY DEFINER originated queries are being run.
> 
> If the table owner enables FORCE ROW SECURITY, policies will affect
> referential integrity queries.  Choose policies accordingly.  For example,
> given only ON UPDATE NO ACTION constraints, it would be no problem to set
> owner-affecting policies for INSERT, UPDATE and/or DELETE.

Perhaps I'm not following correctly, but the above doesn't look correct
to me.  An ON UPDATE NO ACTION constraint would run a query against the
referring table (which has FORCE ROW SECURITY set, perhaps by mistake
after a debugging session of the owner, with a policy that does not
allow any records to be seen by the owner), fail to find any rows, and
conclude that no error needs to be thrown, resulting in the referring
table having records which refer to keys in the referred-to table that
no longer exist (the UPDATE having changed them).

As a test, I hacked the pg_class_ownercheck() case in check_enable_rls()
to return RLS_ENABLED always.  Then did this:

CREATE ROLE r1;
CREATE ROLE r2;
CREATE TABLE t1 (c1 INT PRIMARY KEY);
CREATE TABLE t2 (c1 INT REFERENCES t1 ON UPDATE NO ACTION);
ALTER TABLE t1 OWNER TO r1;
ALTER TABLE t2 OWNER TO r2;
GRANT SELECT ON t1 TO r2;
INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1);
ALTER TABLE t2 ENABLE ROW LEVEL SECURITY;
UPDATE t1 SET c1 = 2;
ALTER TABLE t2 DISABLE ROW LEVEL SECURITY;
TABLE t1;
TABLE t2;

With results:

=*# TABLE t1;
 c1 

  2
(1 row)

=*# TABLE t2;   
 
 c1 

  1
(1 row)

This would lead to trivial to cause (and likely hard to diagnose)
referential integrity data corruption issues.  I find that a very hard
pill to swallow in favor of a theoretical concern that new code may open
avenues of exploit for a new security context mode to bypass RLS when
doing internal referential integrity checks.  Further, it changes this
additional capability, which was agreed would be added to offset removal
of the 'row_security = force' option, from useful to downright
dangerous.

Hopefully, I'm simply misunderstanding your proposal for the
FORCE ROW LEVEL SECURITY option.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Andrew Dunstan



On 09/22/2015 09:25 AM, Geoff Winkless wrote:
 If someone sends me a document that uses "their" in a singular usage, 
I will think that person is lazy. That will continue to be the case, 
whether people tell me that it's accepted usage or not.



You can think that if you like, but it's not even remotely true. It's a 
deliberate choice to use a new, perfectly reasonable and now widely 
accepted style of which you disapprove, but it's not lazy.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY planning

2015-09-22 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Alvaro Herrera  writes:
> > I noticed that COPY calls planner() (this was introduced in 85188ab88).
> > I think it should be calling pg_plan_query() instead.
> 
> +1 --- AFAICS, this is the *only* place that is going directly to
> planner() without going through pg_plan_query(); other utility
> functions such as CREATE TABLE AS do the latter.
> 
> As far as the patch goes, do copy.c's #include's need adjustment?
> I'm wondering if optimizer/planner.h could be removed, in particular.

BeginCopyFrom still uses expression_planner(), at least..

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Rework the way multixact truncations work

2015-09-22 Thread Andres Freund
On 2015-07-02 11:52:04 -0400, Robert Haas wrote:
> On Mon, Jun 29, 2015 at 3:48 PM, Andres Freund  wrote:
> > New version attached.
> 
> 0002 looks good, but the commit message should perhaps mention the
> comment fix.  Or commit that separately.

I'm inclined to backpatch the applicable parts to 9.0 - seems pointless
to have differing autovacuum_freeze_max_age values and the current value
sucks for testing and space consumption there as well.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Andrew Dunstan



On 09/22/2015 06:17 AM, Geoff Winkless wrote:
On 22 September 2015 at 10:52, Gavin Flower 
>wrote:


On 22/09/15 21:33, Geoff Winkless wrote:

​ ​
Without wanting to get into a grammar war, ​I'm not so sure I
agree that it "condones" it. Dictionaries reflect the current
state of usage, they don't act as arbiters of correctness. The
abuse of "literally" as an emphasiser (which usage is now
listed in the OED) is a prime example.

I would prefer "his or her" over "their". Perhaps our American
cousins might disagree though.

I prefer "their" rather than "his or her", it is less clumsy &
there is no point in specifying gender unless it is relevant!


I agree in that I prefer "their" in informal speech; however in a 
formal document I would find it sloppy.​ I don't think "his or her" is 
inherently clumsy; m

aybe I'm just showing my age.​

Besides, some people are neither, or their biological gender is
ambiguous - so a few people fit into neither the male nor the
female category (depending on precise definitions, about 0.5%)!


My understanding is that most intersex (and certainly all trans) 
people would identify with one or the other, and even those who don't 
select exclusively identify with a mix of both (and would therefore 
still be covered by "his or her", no?) although I don't pretend to be 
an expert.


Perhaps it would be easier to avoid the controversy by actually 
rewording into the plural, where possible?


So

"any user can make such a change for his session."

becomes

"Users can make such a change for their individual sessions"

or similar?





You are fighting a losing battle. Think of they/them/their/theirs as 
being indefinitely gendered third person singular pronouns, as well as 
being third person plural pronouns. Yes it's a relatively new usage, but 
I don't think its at all unreasonable (speaking as someone who has been 
known to dislike some new usages and neologisms). It's not at all 
sloppy. On the contrary, it's quite deliberate. It's just not quite 
traditional. You need to get over that.


Your proposed style would make writing docs a lot harder, forcing us to 
avoid use of the singular in cases where it is quite natural. I'm 
strongly opposed to such a style rule.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Andrew Dunstan



On 09/22/2015 12:32 AM, Erik Rijkers wrote:



-   might not be the same as the database user he needs to connect as.
+   might not be the same as the database user that is to be connect as.


It is not an improvement.  I would like to see this change rolled back.





I agree this is awkward.

I would use "might not be the same as the database user they need to 
connect as." Let's be consistent about the use of they/them/their/theirs 
as indefinitely gendered singular pronouns.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Use gender-neutral language in documentation

2015-09-22 Thread Geoff Winkless
Oh, good! We're actually going to have this argument? Even though I said I
don't care what you do?

On 22 September 2015 at 15:11, Andrew Dunstan  wrote:

> On 09/22/2015 09:25 AM, Geoff Winkless wrote:
>
>>  If someone sends me a document that uses "their" in a singular usage, I
>> will think that person is lazy. That will continue to be the case, whether
>> people tell me that it's accepted usage or not.
>
>
> You can think that if you like,


​Thanks!​ That's a great relief to me, as I'm sure you can imagine.


> but it's not even remotely true.


​You just stated that the reason you don't want to use the plural form I
suggested is because it's too hard/time-consuming. That does suggest you
accept that it's a valid solution but you're too lazy to use it.​


> It's a deliberate choice to use a new, perfectly reasonable and now widely
> accepted style of which you disapprove, but it's not lazy.
>

​That's your opinion; my opinion remains otherwise. It's not "perfectly
reasonable" to abuse the plural because some 1960s feminazis either
misunderstood or didn't like the fact that (because of history) in English
the gender-neutral singular happens to also be the male singular.

Happily for me, I can continue to write documents in a grammatically
correct way, and no-one will read them and think I'm a grammar-nazi (or
obstinate, or old-fashioned or whatever) because unless they're
specifically looking for it no-one will notice that I'm avoiding the
contentious usage altogether. On the other hand, there _will_ be a (perhaps
significant) proportion of people who read your documents and think that
you're incapable of writing a grammatically correct sentence.

Geoff


Re: [HACKERS] 9.5: Can't connect with PGSSLMODE=require on Windows

2015-09-22 Thread Robert Haas
On Tue, Sep 22, 2015 at 9:54 AM, Thom Brown  wrote:
> Hi,
>
> With 9.5 alpha 2 on Windows 8 (64-bit), trying to require SSL results
> in a blocking error:
>
> pg_hba.conf:
> hostssl   postgres   postgres   0.0.0.0/0   trust
>
> postgresql.conf:
> ssl=on
>
>
> C:\Program Files\PostgreSQL\9.5\bin>SET PGSSLMODE=require
>
> C:\Program Files\PostgreSQL\9.5\bin>psql.exe -p 5432 -U postgres -h
> localhost postgres
> psql: SSL SYSCALL error: Operation would block (0x2733/10035)
>
>
>
> Doing the same thing on Linux doesn't result in the error (I'm using
> latest Git master here):
>
> thom@swift:~/Development/data$ PGSSLMODE=require psql -U thom -p 5488
> -d postgres -h 127.0.0.1
> psql (9.6devel)
> SSL connection (protocol: TLSv1.2, cipher:
> ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
> Type "help" for help.
>
> postgres=#

This, by the way, was reported internally to EDB by our QA team.  My
understanding is that it doesn't happen on 9.4, but nobody knows yet
which 9.5 commit broke it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Can't connect with PGSSLMODE=require on Windows

2015-09-22 Thread Andrew Dunstan



On 09/22/2015 10:57 AM, Robert Haas wrote:

On Tue, Sep 22, 2015 at 9:54 AM, Thom Brown  wrote:

Hi,

With 9.5 alpha 2 on Windows 8 (64-bit), trying to require SSL results
in a blocking error:

pg_hba.conf:
hostssl   postgres   postgres   0.0.0.0/0   trust

postgresql.conf:
ssl=on


C:\Program Files\PostgreSQL\9.5\bin>SET PGSSLMODE=require

C:\Program Files\PostgreSQL\9.5\bin>psql.exe -p 5432 -U postgres -h
localhost postgres
psql: SSL SYSCALL error: Operation would block (0x2733/10035)



Doing the same thing on Linux doesn't result in the error (I'm using
latest Git master here):

thom@swift:~/Development/data$ PGSSLMODE=require psql -U thom -p 5488
-d postgres -h 127.0.0.1
psql (9.6devel)
SSL connection (protocol: TLSv1.2, cipher:
ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

postgres=#

This, by the way, was reported internally to EDB by our QA team.  My
understanding is that it doesn't happen on 9.4, but nobody knows yet
which 9.5 commit broke it.



"git bisect" is your friend.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-22 Thread Syed, Rahila
Hello,

Please find attached patch with bugs reported by Thom and Sawada-san solved.

>* The progress of vacuum by autovacuum seems not to be displayed.
The progress is stored in shared variables during autovacuum. I guess the 
reason they are not visible is that the entries are deleted as soon as the 
process exits.
But the progress can be viewed while autovacuum worker is running.

Thank you,
Rahila Syed

__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.


Vacuum_progress_checker_v3.patch
Description: Vacuum_progress_checker_v3.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers