[HACKERS] Shared memory and processes

2016-04-27 Thread david
Does Postgres provide a convenient way for one process to pass data to another 
using shared memory?

 

Regards

David M Bennett FACS

  _  

Andl - A New Database Language - andl.org

 



Re: [HACKERS] Shared memory and processes

2016-04-27 Thread david
> From: Thomas Munro [mailto:thomas.mu...@enterprisedb.com]
> > Does Postgres provide a convenient way for one process to pass data to
> > another using shared memory?
> 
> 1.  Look at ShmemInitStruct, ShmemInitHash (as in hash table), ShmemInitQueue
> in storage/shmem.h.  These use memory that is mapped at the same address in
> every backend (process) which is convenient for sharing data structures with
> internal pointers.  The amount of memory available is fixed at cluster
> startup however.

Thanks. That limit could be an issue. The notes in shmem.c are helpful.

> 2.  Look at dsm_XXX in storage/dsm.h.  This subsystem provides segments of
> memory that is "dynamic" in the sense that it is limited only by your
> system's available virtual memory, but you have to explicitly attach these
> segment in any backend that wants to access them by passing a handle around
> and the memory may be mapped at any address in each backend, so you need to
> work harder to build data structures that reference each other (using offsets
> rather than pointers, that kind of thing).  DSM segments won't work well if
> you try to create large numbers of them, so you'll need to provide a way to
> manage the space inside a smallish number of large chunks of DSM memory
> yourself (this is a problem I'm working to fix by providing a general purpose
> allocator backed by a bunch of DSM segments -- watch this space).  LWLocks
> (our usual lock primitive for cases where spinlocks are inappropriate)
> currently don't work correctly inside DSM segments (this too will be fixed).

I've now found this through the test_shm_mq sample. Looks like an answer, if 
quite a bit of machinery needed.

Thanks for the pointers.


Regards
David M Bennett FACS

Andl - A New Database Language - andl.org





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Shared memory and processes

2016-04-27 Thread david
> From: Michael Paquier [mailto:michael.paqu...@gmail.com]
> > Does Postgres provide a convenient way for one process to pass data to
> > another using shared memory?
> 
> If you are talking about enabling the use of shared memory by 3rd-part
> plugins or modules, there is shmem_startup_hook for this purpose.

Thanks -- looks interesting.

> Another thing that you may want to look at is DSM (dynamic shared memory),
> with its interface in dsm.h. An example of use is in 
> src/test/modules/test_shm_mq.

Yes, that is useful. I'm curious why the documentation (F.41) was removed 
between 9.4 and 9.5 -- is there a problem?

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] About subxact and xact nesting level...

2016-05-01 Thread david
Still trying to find my way around the source code…

 

The file xact.c contains references to sub-transactions (subxact) and 
transaction nesting level, but no obvious documentation about what these 
correspond to in SQL. A search shows that plpython supports something called 
“proper sub transactions”. There are random mentions of subtransactions in the 
release notes, but nothing substantive that I can find, and nothing about 
transaction nesting.

 

Any pointers to docs or help to understand much appreciated.

 

Regards

David M Bennett FACS

  _  

Andl - A New Database Language - andl.org

 

 



Re: [HACKERS] About subxact and xact nesting level...

2016-05-02 Thread david
> From: Thomas Munro [mailto:thomas.mu...@enterprisedb.com]

> > The file xact.c contains references to sub-transactions (subxact) and
> > transaction nesting level, but no obvious documentation about what
> > these correspond to in SQL. A search shows that plpython supports
> > something called “proper sub transactions”. There are random mentions
> > of subtransactions in the release notes, but nothing substantive that
> > I can find, and nothing about transaction nesting.
> >
> > Any pointers to docs or help to understand much appreciated.
> 
> Subtransactions are used to implement SAVEPOINT, and also BEGIN blocks with
> EXCEPTION clauses in plpgsql.
> 
> http://www.postgresql.org/docs/9.5/static/sql-savepoint.html
> http://www.postgresql.org/docs/9.5/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

Thanks. I guess that explains the nesting level too. It seems there is an 
internal API based on:
* BeginInternalSubTransaction
* RollbackAndReleaseCurrentSubTransaction
* ReleaseCurrentSubTransaction

This looks like something I shall need to use. I have the plandl language 
handler all working, and understanding the transaction environment is turning 
out to be a major challenge. 

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PERFORM] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread david

On Tue, 29 Dec 2009, Greg Stark wrote:


On Mon, Dec 28, 2009 at 10:54 PM, Andres Freund  wrote:

fsync everything in that pass.
Including the directory - which was not done before and actually might be
necessary in some cases.


Er. Yes. At least on ext4 this is pretty important. I wish it weren't,
but it doesn't look like we're going to convince the ext4 developers
they're crazy any day soon and it would really suck for a database
created from a template to have files in it go missin.


actually, as I understand it you need to do this on all filesystems except 
ext3, and on ext3 fsync is horribly slow because it writes out 
_everything_ that's pending, not just stuff related to the file you do the 
fsync on.


David Lang

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PERFORM] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread david

On Tue, 29 Dec 2009, Andres Freund wrote:


On Tuesday 29 December 2009 01:30:17 da...@lang.hm wrote:

On Tue, 29 Dec 2009, Greg Stark wrote:

On Mon, Dec 28, 2009 at 10:54 PM, Andres Freund 

wrote:

fsync everything in that pass.
Including the directory - which was not done before and actually might
be necessary in some cases.


Er. Yes. At least on ext4 this is pretty important. I wish it weren't,
but it doesn't look like we're going to convince the ext4 developers
they're crazy any day soon and it would really suck for a database
created from a template to have files in it go missin.


actually, as I understand it you need to do this on all filesystems except
ext3, and on ext3 fsync is horribly slow because it writes out
_everything_ that's pending, not just stuff related to the file you do the
fsync on.

I dont think its all filesystems (ext2 should not be affected...), but generally
youre right. At least jfs, xfs are affected as well.


ext2 definantly needs the fsync on the directory as well as the file 
(well, if the file metadata like size, change)



Its btw not necessarily nearly-safe and slow on ext3 as well (data=writeback).


no, then it's just unsafe and slow ;-)

David Lang

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 2016-03 Commitfest Manager

2016-03-01 Thread David Steele
I volunteered a while back to be the CFM and I haven't seen any other
volunteers or objections to my offer.

I am still ready, eager, and willing!

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Addition of extra commit fest entry to park future patches

2016-03-01 Thread David Steele
On 3/1/16 3:28 PM, Magnus Hagander wrote:
> On Tue, Mar 1, 2016 at 10:12 AM, Alvaro Herrera
> Magnus Hagander wrote:
>
> > Yeah, we can do that. I'd suggest we either name it based on the current
> > tentative date for CF1 (september), or name it specificaly "9.7-first" 
> or
> > something like that rather than just plain "future", to make it more 
> clear.
> 
> +1 to both names suggested by Magnus.
> 
> 
> 
> We do need to pick one of them :)

I'm good with 9.7-first.  I presume it can be renamed later to fit the
standard scheme?

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Addition of extra commit fest entry to park future patches

2016-03-01 Thread David Steele
On 3/1/16 3:35 PM, Kevin Grittner wrote:
> On Tue, Mar 1, 2016 at 2:28 PM, Magnus Hagander  wrote:
>> On Tue, Mar 1, 2016 at 10:12 AM, Alvaro Herrera  
>> wrote:
>>> Magnus Hagander wrote:
> 
>>>> I'd suggest we either name it based on the current tentative
>>>> date for CF1 (september), or name it specificaly "9.7-first" or
>>>> something like that rather than just plain "future", to make it
>>>> more clear.
>>>
>>> +1 to both names suggested by Magnus.
>>
>> We do need to pick one of them :)
>>
>> Anybody else with preferences?
> 
> I would prefer to see a consistent namimg pattern (i.e., 2016-09)
> and rename it if we reschedule.

I'm fine with that - it does help set expectations.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] 2016-03 Commitfest Manager

2016-03-01 Thread David Steele
On 3/1/16 3:04 PM, Tom Lane wrote:
> David Steele  writes:
>> I volunteered a while back to be the CFM and I haven't seen any other
>> volunteers or objections to my offer.
> 
>> I am still ready, eager, and willing!
> 
> I haven't heard any other volunteers either.  You have the conn, sir.

Aye aye!  Once the 2016-03 CF has been closed I'll send out a kickoff
report.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] 2016-03 Commitfest Manager

2016-03-01 Thread David Steele
On 3/1/16 8:49 PM, Michael Paquier wrote:
> On Wed, Mar 2, 2016 at 7:22 AM, Michael Paquier
>  wrote:
>> On Wed, Mar 2, 2016 at 6:15 AM, David Steele  wrote:
>>> On 3/1/16 3:04 PM, Tom Lane wrote:
>>>> David Steele  writes:
>>>>> I volunteered a while back to be the CFM and I haven't seen any other
>>>>> volunteers or objections to my offer.
>>>>
>>>>> I am still ready, eager, and willing!
>>>>
>>>> I haven't heard any other volunteers either.  You have the conn, sir.
>>>
>>> Aye aye!  Once the 2016-03 CF has been closed I'll send out a kickoff
>>> report.
>>
>> Magnus, should David have administration rights in the CF app? I
>> believe that he cannot change the CF status now. We should switch to
>> "In Progress". I can do that at least.
> 
> I am planning to switch 2016-03 from "Open" to "Processing" in a
> couple of hours, at which stage nobody will be able to register new
> patches to it. I guess it's more than time.

Agreed.  I see you created the new CF so no reason to keep it open.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] 2016-03 Commitfest Manager

2016-03-01 Thread David Fetter
On Wed, Mar 02, 2016 at 10:49:01AM +0900, Michael Paquier wrote:
> On Wed, Mar 2, 2016 at 7:22 AM, Michael Paquier
>  wrote:
> > On Wed, Mar 2, 2016 at 6:15 AM, David Steele  wrote:
> >> On 3/1/16 3:04 PM, Tom Lane wrote:
> >>> David Steele  writes:
> >>>> I volunteered a while back to be the CFM and I haven't seen any other
> >>>> volunteers or objections to my offer.
> >>>
> >>>> I am still ready, eager, and willing!
> >>>
> >>> I haven't heard any other volunteers either.  You have the conn, sir.
> >>
> >> Aye aye!  Once the 2016-03 CF has been closed I'll send out a kickoff
> >> report.
> >
> > Magnus, should David have administration rights in the CF app? I
> > believe that he cannot change the CF status now. We should switch to
> > "In Progress". I can do that at least.
> 
> I am planning to switch 2016-03 from "Open" to "Processing" in a
> couple of hours, at which stage nobody will be able to register new
> patches to it. I guess it's more than time.

How do I get my weighted stats patch in?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 2016-03 Commitfest Manager

2016-03-01 Thread David Fetter
On Wed, Mar 02, 2016 at 03:34:39PM +0900, Michael Paquier wrote:
> On Wed, Mar 2, 2016 at 2:53 PM, David Fetter  wrote:
> > On Wed, Mar 02, 2016 at 10:49:01AM +0900, Michael Paquier wrote:
> >> On Wed, Mar 2, 2016 at 7:22 AM, Michael Paquier
> >>  wrote:
> >> > On Wed, Mar 2, 2016 at 6:15 AM, David Steele  wrote:
> >> >> On 3/1/16 3:04 PM, Tom Lane wrote:
> >> >>> David Steele  writes:
> >> >>>> I volunteered a while back to be the CFM and I haven't seen any other
> >> >>>> volunteers or objections to my offer.
> >> >>>
> >> >>>> I am still ready, eager, and willing!
> >> >>>
> >> >>> I haven't heard any other volunteers either.  You have the conn, sir.
> >> >>
> >> >> Aye aye!  Once the 2016-03 CF has been closed I'll send out a kickoff
> >> >> report.
> >> >
> >> > Magnus, should David have administration rights in the CF app? I
> >> > believe that he cannot change the CF status now. We should switch to
> >> > "In Progress". I can do that at least.
> >>
> >> I am planning to switch 2016-03 from "Open" to "Processing" in a
> >> couple of hours, at which stage nobody will be able to register new
> >> patches to it. I guess it's more than time.
> >
> > How do I get my weighted stats patch in?
> 
> Why didn't you register it before and why didn't you send a new
> version yet? Looking at this stuff now the last activity for this
> patch was the 12th of January, with no new version, just a mention
> that the patch will be fixed:
> http://www.postgresql.org/message-id/cajrrpgf1kt-5bn+rpunxmkcigxnqs5_ratyqseun-xff9h+...@mail.gmail.com

I have been pretty busy with some high-priority family and work stuff,
including a terminal illness (not mine) and a new job.

None of that is a reason to go around the CF process.  I'll see
whether I can wheedle a committer's time on this once I have the new
patch. If I can't, I'll queue it for the next one.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 2016-03 Commitfest

2016-03-02 Thread David Steele
The 2016-03 commitfest is officially in progress!

There are currently a lot of patches waiting for review but with no
reviewers:

Needs review: 97
Needs *reviewer*: 58

Please check the "needs reviewer" list
(https://commitfest.postgresql.org/9/?reviewer=-2) for patches to
review.  The committers need our help to work through this enormous load
of patches.

If this is you:

Waiting on Author: 16

Then please post an updated patch as soon as possible so it can be
reviewed.  Some of these patches have not seen any activity from the
author in a long time.

The good news is we are already 14% done with the CF:

Committed: 17
Rejected: 2
Returned with Feedback: 1

I'll post a status update on this thread at least once a week and more
often as needed.  Going forward there will be more detail on individual
patches that are not making progress for whatever reason.

Let's get reviewing!

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread David Steele
On 3/2/16 11:10 AM, Shulgin, Oleksandr wrote:
> On Wed, Feb 24, 2016 at 12:30 AM, Tomas Vondra
> mailto:tomas.von...@2ndquadrant.com>> wrote:
> 
> I think it'd be useful not to have all the changes in one lump, but
> structure this as a patch series with related changes in separate
> chunks. I doubt we'd like to mix the changes in a single commit, and
> it makes the reviews and reasoning easier. So those NULL-handling
> fixes should be in one patch, the MCV patches in another one.
> 
> 
> OK, such a split would make sense to me.  Though, I'm a bit late as the
> commitfest is already closed to new patches, I guess asking the CF
> manager to split this might work (assuming I produce the patch files)?

If the patch is broken into two files that gives the review/committer
more options but I don't think it requires another CF entry.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread David Rowley
On 3 March 2016 at 04:29, Alvaro Herrera  wrote:
> Simon Riggs wrote:
>
>> ISTM that we are clearly "going for it"; everybody agrees we should apply
>> the patch now.
>>
>> The longer we hold off on applying it, the longer we wait for dependent
>> changes.
>
> Agreed -- we need this in tree as soon as realistically possible.
>
> There is a a bit a problem here, because this patch conflicts heavily
> with at least one other patch that's been in the queue for a long time,
> which is Kommi/Rowley's patch for parallel aggregation; the more we
> delay applying this one, the worse the deadlines for that one.
>
> I assume they are hard at work updating that patch to apply on top of
> Tom's patch.  It's not realistic to expect that we would apply any
> further planner changes before this one is in.

I agree that it would be good to get this in as soon as possible. I'm
currently very close to being done with writing Parallel Aggregate on
top of the upper planner changes. So far this version is much cleaner
as there's less cruft added compared with the other version, of which
would need to be removed again after the upper planner changes are in
anyway. Putting parallel aggregate in first would be asking Tom to
re-invent parallel aggregate when he rebases the upper planner stuff
on the new master branch, which makes very little sense.

So I agree that it would be nice to get the upper planner changes in
first, but soon! not at the end of March 'fest, as doing so would most
likely kill parallel aggregate for 9.6, and I kinda think that would
be silly as (I think) it's pretty much the biggest missing piece of
the parallel query set.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-03 Thread David Rowley
On 3 March 2016 at 18:04, Tom Lane  wrote:
> David Rowley  writes:
>> I agree that it would be good to get this in as soon as possible. I'm
>> currently very close to being done with writing Parallel Aggregate on
>> top of the upper planner changes. So far this version is much cleaner
>> as there's less cruft added compared with the other version,
>
> Cool!  If you come across any points where it seems like it could be
> done better or more easily, that would be tremendously valuable feedback
> at this stage.

Well since you mention it, I started on hash grouping and it was
rather simple and clean as in create_agg_path() I just created a chain
of the required paths and let create_plan() recursively build the
Finalize Aggregate -> Gather -> Partial Aggregate -> Seq Scan plan.
That was rather simple, and actually very nice when compared to how
things are handled in today's grouping planner. When it comes to Group
Aggregate I notice that you're using a RollupPath rather than an
AggPath even when there's no grouping sets. This also means that
create_agg_path() is only ever used for the AGG_HASHED strategy, even
though the 'strategy' is passed as a parameter to that function, so it
seemed prudent to me, to make sure all strategies are handled properly
there.

My gripe is that I've added the required code to build the parallel
group aggregate to create_agg_path() already, but since Group
Aggregate uses the RollupPath I'm forced to add code in
create_rollup_plan() which manually stacks up Plan nodes rather than
just dealing with Paths and create_plan() and its recursive call
magic.

I can't quite see any blocker for not doing this, so would you object
to separating out the treatment of Group Aggregate and Grouping Sets
in create_grouping_paths() ? I think it would require less code
overall.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-03 Thread David Rowley
On 3 March 2016 at 22:57, David Rowley  wrote:
> On 3 March 2016 at 18:04, Tom Lane  wrote:
>> If you come across any points where it seems like it could be
>> done better or more easily, that would be tremendously valuable feedback
>> at this stage.
>
> Well since you mention it, I started on hash grouping and it was
> rather simple and clean as in create_agg_path() I just created a chain
> of the required paths and let create_plan() recursively build the
> Finalize Aggregate -> Gather -> Partial Aggregate -> Seq Scan plan.
> That was rather simple, and actually very nice when compared to how
> things are handled in today's grouping planner. When it comes to Group
> Aggregate I notice that you're using a RollupPath rather than an
> AggPath even when there's no grouping sets. This also means that
> create_agg_path() is only ever used for the AGG_HASHED strategy, even
> though the 'strategy' is passed as a parameter to that function, so it
> seemed prudent to me, to make sure all strategies are handled properly
> there.
>
> My gripe is that I've added the required code to build the parallel
> group aggregate to create_agg_path() already, but since Group
> Aggregate uses the RollupPath I'm forced to add code in
> create_rollup_plan() which manually stacks up Plan nodes rather than
> just dealing with Paths and create_plan() and its recursive call
> magic.
>
> I can't quite see any blocker for not doing this, so would you object
> to separating out the treatment of Group Aggregate and Grouping Sets
> in create_grouping_paths() ? I think it would require less code
> overall.

Actually I might have jumped the gun a little here with my complaint.
I think it does not matter about this as I've just coded Parallel
Group Aggregate part to reuse create_agg_path(), but left the
non-parallel version to make use of create_rollup_path(). I don't
think I want to go to the trouble of parallel grouping sets at this
stage anyway.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-03 Thread David Rowley
On 4 March 2016 at 09:29, Robert Haas  wrote:
> On Thu, Mar 3, 2016 at 2:19 PM, Tom Lane  wrote:
>> This leads me to the conclusion that all these new node types should
>> set parallel_aware to false and copy up the other two fields from the
>> child, except for LockRows and ModifyTable which should set them all
>> to false/0.  Correct?  If so, I'll go fix.
>
> Well, I'd probably bubble it up regardless.  The fact that the overall
> plan writes data will cause everything in the plan to have
> parallel_safe = false and parallel_degree = 0, so you'll get the same
> outcome either way.  However, that way, if writes eventually become
> safe, then this won't need adjusting.  But it doesn't really matter
> much; feel free to do it as you say if you prefer.

This would help me too. I hit a problem with this when adding Group
Parallel Aggregate, as the sort path is conditionally added in
create_grouping_paths() which causes the parallel_degree for the
subpath which is later passed into create_agg_path() to become 0. in
create_agg_path() I was using the parallel_degree variable to
determine if I should construct a normal serial agg path, or a chain
of partial agg -> gather -> final agg paths. To get around the problem
I added a parallel_degree parameter to create_agg_path() of which that
parameter so far seems to only exist in create_seqscan_path(), which
naturally is a bit special as it's a "leaf node" in the path tree.
Propagating these would mean I could remove that parameter again.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-03 Thread David Rowley
On 17 February 2016 at 17:50, Haribabu Kommi  wrote:
> Here I attached a draft patch based on previous discussions. It still needs
> better comments and optimization.

Over in [1] Tom posted a large change to the grouping planner which
causes large conflict with the parallel aggregation patch. I've been
looking over Tom's patch and reading the related thread and I've
observed 3 things:

1. Parallel Aggregate will be much easier to write and less code to
base it up top of Tom's upper planner changes. The latest patch does
add a bit of cruft (e.g create_gather_plan_from_subplan()) which won't
be required after Tom pushes the changes to the upper planner.
2. If we apply parallel aggregate before Tom's upper planner changes
go in, then Tom needs to reinvent it again when rebasing his patch.
This seems senseless, so this is why I did this work.
3. Based on the thread, most people are leaning towards getting Tom's
changes in early to allow a bit more settle time before beta, and
perhaps also to allow other patches to go in after (e.g this)

So, I've done a bit of work and I've rewritten the parallel aggregate
code to base it on top of Tom's patch posted in [1]. There's a few
things that are left unsolved at this stage.

1. exprType() for Aggref still returns the aggtype, where it needs to
return the trans type for partial agg nodes, this need to return the
trans type rather than the aggtype. I had thought I might fix this by
adding a proxy node type that sits in the targetlist until setrefs.c
where it can be plucked out and replaced by the Aggref. I need to
investigate this further.
2. There's an outstanding bug relating to HAVING clause not seeing the
right state of aggregation and returning wrong results. I've not had
much time to look into this yet, but I suspect its an existing bug
that's already in master from my combine aggregate patch. I will
investigate this on Sunday.

In regards to the patch, there's a few things worth mentioning here:

1. I've had to add a parallel_degree parameter to create_group_path()
and create_agg_path(). I think Tom is going to make changes to his
patch so that the Path's parallel_degree is propagated to subnodes,
this should allow me to remove this parameter and just use
parallel_degree the one from the subpath.
2. I had to add a new parameter to pass an optional row estimate to
cost_gather() as I don't have a RelOptInfo available to get a row
estimate from which represents the state after partial aggregation. I
thought this change was ok, but I'll listen to anyone who thinks of a
better way to do it.
3. The code never attempts to mix and match Grouping Agg and Hash Agg
plans. e.g it could be an idea to perform Partial Hash Aggregate ->
Gather -> Sort -> Finalize Group Aggregate, or hash as in the Finalize
stage. I just thought doing this is more complex than what's really
needed, but if someone can think of a case where this would be a great
win then I'll listen, but you have to remember we don't have any
pre-sorted partial paths at this stage, so an explicit sort is
required *always*. This might change if someone invented partial btree
index scans... but until then...

Due to the existence of the outstanding issues above, I feel like I
might be posting the patch a little earlier, but wanted to do so since
this is quite a hot area in the code at the moment and I wanted to
post for transparency.

To apply the patch please apply [1] first.

[1] http://www.postgresql.org/message-id/3795.1456689...@sss.pgh.pa.us

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


parallel_aggregation_d6850a9f_2016-03-04.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add generate_series(date, date) and generate_series(date, date, integer)

2016-03-04 Thread David Steele
On 2/21/16 2:24 PM, Vik Fearing wrote:
> On 02/21/2016 07:56 PM, Corey Huinker wrote:
>>
>> Other than that, the only difference is the ::date part. Is it
>> really worth adding extra code just for that? I would say not.
>>
>>
>> I would argue it belongs for the sake of completeness. 
> 
> So would I.
> 
> +1 for adding this missing function.

+1. FWIW, a sample query I wrote for a customer yesterday would have
looked nicer with this function. Here's how the generate_series looked:

generate_series('2016-03-01'::date, '2016-03-31'::date, interval '1
day')::date

But it would have been cleaner to write:

generate_series('2016-03-01'::date, '2016-03-31'::date)

More importantly, though, I don't like that the timestamp version of the
function happily takes date parameters but returns timestamps. I feel
this could lead to some subtle bugs for the unwary.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-05 Thread David Rowley
On 5 March 2016 at 10:43, Alvaro Herrera  wrote:
> I wonder why do we have two identical copies of clause_sides_match_join ...

Yeah, I noticed the same a while back, and posted about it. Here was
the response:
http://www.postgresql.org/message-id/26820.1405522...@sss.pgh.pa.us


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-06 Thread David Rowley
On 7 March 2016 at 18:19, Haribabu Kommi  wrote:
> Here I attached update patch with further changes,
> 1. Explain plan changes for parallel grouping

Perhaps someone might disagree with me, but I'm not all that sure I
really get the need for that. With nodeAgg.c we're doing something
fundamentally different in Partial mode as we are in Finalize mode,
that's why I wanted to give an indication of that in the explain.c
originally. A query with no Aggregate functions using nodeGroup.c
there is no special handling in the executor for partial and final
stages, so I really don't see why we need to give the impression that
there is in EXPLAIN.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (pgaudit) Audit log is not output after the SET ROLE.

2016-03-07 Thread David Steele
On 3/7/16 4:39 AM, Toshi Harada wrote:
> 
> I am testing the pgaudit(https://commitfest.postgresql.org/9/463/).
> (use "http://www.postgresql.org/message-id/56b0101b.6070...@pgmasters.net"; 
> attached patch on 9.6-devel)
> 
> I found strange thing.
> 
> - After SET ROLE, part of the SQL is not the audit log output.
> - SQL comprising a relation is not output to the audit log.
> 
> * Reproduce:
> ** prepare
> 
> createuser test_user -U postgres
> createdb test -U postgres -O test_user
> psql test -U test_user -c "CREATE TABLE team(id int, name text)"
> 
> ** pgaudit settings
> 
> shared_preload_libraries = 'pgaudit'
> pgaudit.log = 'all'

Both of these are in postgresql.conf?

> ** test sql script (test.sql)
> 
> SELECT 1;
> SELECT * FROM team; -- output audit log
> SET ROLE test_user;
> SELECT 2;
> SELECT * FROM team; -- no output audit log
> SELECT 3;
> RESET ROLE;
> SELECT * FROM team; -- output audit log
> 
> ** run script
> 
> psql test -U postgres -f test.sql
> 
> ** audit log
> 
> LOG:  AUDIT: SESSION,1,1,READ,SELECT,,,SELECT 1;,
> LOG:  AUDIT: SESSION,2,1,READ,SELECT,,,SELECT * FROM team;,
> LOG:  AUDIT: SESSION,3,1,MISC,SET,,,SET ROLE test_user;,
> LOG:  AUDIT: SESSION,4,1,READ,SELECT,,,SELECT 2;,
> LOG:  AUDIT: SESSION,5,1,READ,SELECT,,,SELECT 3;,
> LOG:  AUDIT: SESSION,6,1,MISC,RESET,,,RESET ROLE;,
> LOG:  AUDIT: SESSION,7,1,READ,SELECT,,,SELECT * FROM team;,

Well, that definitely doesn't look right.

You may have noticed that the pgaudit patch is marked as "returned with
feedback" so it is closed for the current commitfest and will not be
included in 9.6.

I'll definitely look at this bug but I would ask that you resubmit it at
https://github.com/pgaudit/pgaudit/issues so we can continue the
conversation there.

Thanks!
-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] WIP: Upper planner pathification

2016-03-07 Thread David Rowley
On 8 March 2016 at 10:01, Tom Lane  wrote:
> I've pushed it now; we'll soon see if the buildfarm finds any problems.

Fantastic! I'm looking forward to all the future optimisation
opportunities that this opens up.
Thanks for making this happen.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-07 Thread David Rowley
On 5 March 2016 at 07:25, Robert Haas  wrote:
> On Thu, Mar 3, 2016 at 11:00 PM, David Rowley
>> 3. The code never attempts to mix and match Grouping Agg and Hash Agg
>> plans. e.g it could be an idea to perform Partial Hash Aggregate ->
>> Gather -> Sort -> Finalize Group Aggregate, or hash as in the Finalize
>> stage. I just thought doing this is more complex than what's really
>> needed, but if someone can think of a case where this would be a great
>> win then I'll listen, but you have to remember we don't have any
>> pre-sorted partial paths at this stage, so an explicit sort is
>> required *always*. This might change if someone invented partial btree
>> index scans... but until then...
>
> Actually, Rahila Syed is working on that.  But it's not done yet, so
> presumably will not go into 9.6.
>
> I don't really see the logic of this, though.  Currently, Gather
> destroys the input ordering, so it seems preferable for the
> finalize-aggregates stage to use a hash aggregate whenever possible,
> whatever the partial-aggregate stage did.  Otherwise, we need an
> explicit sort.  Anyway, it seems like the two stages should be costed
> and decided on their own merits - there's no reason to chain the two
> decisions together.

Thanks for looking at this.
I've attached an updated patch which re-bases the whole patch on top
of the upper planner changes which have just been committed.
In this version create_grouping_paths() does now consider mixed
strategies of hashed and sorted, although I have a few concerns with
the code that I've written. I'm solely posting this early to minimise
any duplicate work.

My concerns are:
1. Since there's no cheapest_partial_path in RelOptInfo the code is
currently considering every partial_path for parallel hash aggregate.
With normal aggregation we only ever use the cheapest path, so this
may not be future proof. As of today we do only have at most one
partial path in the list, but there's no reason to code this with that
assumption. I didn't put in much effort to improve this as I see code
in generate_gather_paths() which also makes assumptions about there
just being 1 partial path. Perhaps we should expand RelOptInfo to
track the cheapest partial path? or maybe allpaths.c should have a
function to fetch the cheapest out of the list?

2. In mixed parallel aggregate mode, when the query has no aggregate
functions, the code currently will use a nodeAgg for AGG_SORTED
strategy rather than a nodeGroup, as it would in serial agg mode. This
probably needs to be changed.

3. Nothing in create_grouping_paths() looks at the force_parallel_mode
GUC. I had a quick look at this GUC and was a bit surprised to see 3
possible states, but no explanation of what they do, so I've not added
code which pays attention to this setting yet. I'd imagine this is
just a matter of skipping serial path generation when parallel is
possible when force_parallel_mode is FORCE_PARALLEL_ON. I've no idea
what FORCE_PARALLEL_REGRESS is for yet.

The setrefs.c parts of the patch remain completely broken. I've not
had time to look at this again yet, sorry.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


parallel_aggregation_cc75f61_2016-03-08.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fix misspelling of "parallel"

2016-03-07 Thread David Rowley
The attached fixes a small spelling error in a comment.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


parallel_spelling_fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Typo in logicaldecoding docs

2016-03-07 Thread David Rowley
The attached fixes a small error in the logicaldecoding docs.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


logicaldecoding_docs_typo_fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-08 Thread David Rowley
On 9 March 2016 at 04:06, Robert Haas  wrote:
> On Mon, Mar 7, 2016 at 5:15 PM, David Rowley
>  wrote:
>> My concerns are:
>> 1. Since there's no cheapest_partial_path in RelOptInfo the code is
>> currently considering every partial_path for parallel hash aggregate.
>> With normal aggregation we only ever use the cheapest path, so this
>> may not be future proof. As of today we do only have at most one
>> partial path in the list, but there's no reason to code this with that
>> assumption. I didn't put in much effort to improve this as I see code
>> in generate_gather_paths() which also makes assumptions about there
>> just being 1 partial path. Perhaps we should expand RelOptInfo to
>> track the cheapest partial path? or maybe allpaths.c should have a
>> function to fetch the cheapest out of the list?
>
> The first one in the list will be the cheapest; why not just look at
> that?  Sorted partial paths are interesting if some subsequent path
> construction step can make use of that sort ordering, but they're
> never interesting from the point of view of matching the query's
> pathkeys because of the fact that Gather is order-destroying.

In this case a sorted partial path is useful as the partial agg node
sits below Gather. The sorted input is very interesting for the
partial agg node with a strategy of AGG_SORTED. In most cases with
parallel aggregate it's the partial stage that will take the most
time, so if we do get pre-sorted partial paths, this will be very good
indeed for parallel agg.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-08 Thread David Rowley
On 23 January 2016 at 05:36, Tomas Vondra  wrote:
> Hi,
>
> On 12/17/2015 02:17 PM, David Rowley wrote:
>>
>> On 17 December 2015 at 19:11, Simon Riggs > <mailto:si...@2ndquadrant.com>> wrote:
>>
>> On 17 December 2015 at 00:17, Tomas Vondra
>> mailto:tomas.von...@2ndquadrant.com>>
>> wrote:
>>
>> I'd go with match_first_tuple_only.
>>
>>
>> +1
>>
>> unique_inner is a state that has been detected,
>> match_first_tuple_only is the action we take as a result.
>>
>>
>> Ok great. I've made it so in the attached. This means the comment in the
>> join code where we perform the skip can be a bit less verbose and all
>> the details can go in where we're actually setting the
>> match_first_tuple_only to true.
>
>
> OK. I've looked at the patch again today, and it seems broken bv 45be99f8 as
> the partial paths were not passing the unique_inner to the create_*_path()
> functions. The attached patch should fix that.
>
> Otherwise I think the patch is ready for committer - I think this is a
> valuable optimization and I see nothing wrong with the code.

I've attached an updated patch which updates it to fix the conflicts
with the recent upper planner changes.
I also notice that some regression tests, which I think some of which
Tom updated in the upper planner changes have now changed back again
due to the slightly reduced costs on hash and nested loop joins where
the inner side is unique. I checked the costs of one of these by
disabling hash join and noticed that the final totla cost is the same,
so it's not too surprising that they keep switching plans with these
planner changes going in. I verified that these remain as is when I
comment out the cost changing code in this patch.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


unique_joins_dbcecda_2016-03-09.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-08 Thread David Rowley
On 9 March 2016 at 13:19, Tom Lane  wrote:
> I do think that the verbosity this adds to the EXPLAIN output is not
> desirable at all, at least not in text mode.  Another line for every
> darn join is a pretty high price.

For me it seems like a good idea to give some sort of indication in
EXPLAIN about this, although I'm not wedded to what I have currently
as being the best and neatest way to do this.
I was however under the impression that you thought this was
reasonable [1], so I didn't go seeking out any other way. Did you have
another idea, or are you just proposing to remove it for text EXPLAIN?
Perhaps, if you are then the tests can still exist with a non-text
output.

Also in [1], I disagree with your proposal for being inconsistent with
what's shown with VERBOSE depending on the EXPLAIN output format. If
we were going to do that then I'd rather just switch on verbose for
non-text, but that might make some people unhappy, so I'd rather we
didn't do that either. So how about I remove it from the TEXT output
and just include it in non-text formats when the VERBOSE flag is set?
then change the tests to use... something other than text... YAML
seems most compact.

[1] http://www.postgresql.org/message-id/8907.1440383...@sss.pgh.pa.us

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v6] GSSAPI encryption support

2016-03-09 Thread David Steele
Hi Robbie,

On 3/8/16 5:44 PM, Robbie Harwood wrote:
> Hello friends,
> 
> Here's yet another version of GSSAPI encryption support.  It's also
> available for viewing on my github:

I got this warning when applying the first patch in the set:

../other/v6-0001-Move-common-GSSAPI-code-into-its-own-files.patch:245:
new blank line at EOF.
+
warning: 1 line adds whitespace errors.

I know it's minor but I'm always happier when patches apply cleanly.

The build went fine but when testing I was unable to logon at all.  I'm
using the same methodology as in
http://www.postgresql.org/message-id/56be0ff9.70...@pgmasters.net except
that I'm running against 51c0f63 and using the v6 patch set.

psql simply hangs and never returns.  I have attached a pcap of the
psql/postgres session generated with:

tcpdump -i lo -nnvvXSs 1514 port 5432 -w gssapi.pcap

If you would like me to capture more information please let me know
specifically how you would like me to capture it.

I reverted to v5 and got the same behavior I was seeing with v4 and v5,
namely that I can only logon occasionally and usually get this error:

psql: expected authentication request from server, but received

Using a fresh build from 51c0f63 I can logon reliably every time so I
don't think there's an issue in my environment.

-- 
-David
da...@pgmasters.net


gssapi.pcap
Description: Binary data


signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] HINTing on UPDATE foo SET foo.bar = ..;

2016-03-09 Thread David Steele

On 12/23/15 9:15 PM, Michael Paquier wrote:

On Mon, Dec 7, 2015 at 9:14 PM, Michael Paquier
 wrote:

On Wed, Sep 2, 2015 at 6:19 AM, Marko Tiikkaja  wrote:

Hopefully nobody minds if I slip this to the commit fest that started
today?  The attached patch should address all the comments from the 9.5
cycle.

All the previous comments have been addressed. Perhaps some regression tests
would have some value?

This thread has been stalling for a couple of weeks now. I have marked
it as "returned with feedback". Marko, if you are still working on
this patch, could you add some regression tests and then move it to
the next CF?
This was submitted to the 2016-03 CF but no new patch was supplied and 
there's been no activity on the thread for months.  I'm marking it as 
"returned with feedback."


Marko, if can address Michael's concerns or supply a new patch I'll be 
happy to open the CF entry again.


Thanks,

--
-David
da...@pgmasters.net



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-03-09 Thread David Steele

Hi Victor,

On 2/1/16 5:05 PM, Alvaro Herrera wrote:

Victor Wagner wrote:

On Fri, 22 Jan 2016 16:36:15 -0300
Alvaro Herrera  wrote:


You're editing the expected file for the libpq-regress thingy, but you
haven't added any new lines to test the new capability.  I think it'd
be good to add some there.  (I already said this earlier in the
thread; is there any reason you ignored it the first time?)

I seriously doubt that this program can be used to test new
capabilities.

All it does, it calls PQconninfoParse and than examines some fields of
PGconn structure.

Ah, you're right, I didn't remember that.


If I add some new uris, than only thing I can test is that comma is
properly copied from the URI to this field. And may be that some syntax
errors are properly detected.

Yeah, we should do that.


So, I think that new functionality need other approach for testing.
There should be test of real connection to real temporary cluster.
Probably, based on Perl TAP framework which is actively used in the
Postgres recently.

Yes, agreed.  So please have a look at that one and share your opinion
about it.  It'd be useful.

Meanwhile I'm moving the patch to the next commitfest.


There hasn't been any movement on this patch in a while.  Will you have 
a new tests ready for review soon?


Thanks,

--
-David
da...@pgmasters.net



Re: [HACKERS] Improving replay of XLOG_BTREE_VACUUM records

2016-03-09 Thread David Steele

On 1/8/16 9:34 AM, Alvaro Herrera wrote:

Simon Riggs wrote:

On 8 January 2016 at 13:36, Alvaro Herrera  wrote:

I would agree except for the observation on toast indexes.  I think
that's an important enough use case that perhaps we should have both.

The exclusion of toast indexes is something we can remove also, I have
recently discovered. When we access toast data we ignore MVCC, but we still
have the toast pointer and chunkid to use for rechecking our scan results.
So a later patch will add some rechecks.

Ah, interesting, glad to hear.  I take it you're pushing your patch
soon, then?


ISTM that this patch should be "returned with feedback" or "rejected" 
based on the thread.  I'm marking it "waiting for author" for the time 
being.


Thanks,

--
-David
da...@pgmasters.net



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] statistics for array types

2016-03-09 Thread David Steele

On 1/18/16 11:24 AM, Alvaro Herrera wrote:

Alexander Korotkov wrote:


The patch implementing my idea above is attached.

What's the status here?  Jeff, did you have a look at Alexander's
version of your patch?  Tomas, does this patch satisfy your concerns?


This was marked as "needs review" but I have changed it to "waiting for 
author".


Thanks,
-David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-10 Thread David Rowley
On 9 March 2016 at 04:06, Robert Haas  wrote:
> On Mon, Mar 7, 2016 at 5:15 PM, David Rowley
>  wrote:
>> 3. Nothing in create_grouping_paths() looks at the force_parallel_mode
>> GUC. I had a quick look at this GUC and was a bit surprised to see 3
>> possible states, but no explanation of what they do, so I've not added
>> code which pays attention to this setting yet. I'd imagine this is
>> just a matter of skipping serial path generation when parallel is
>> possible when force_parallel_mode is FORCE_PARALLEL_ON. I've no idea
>> what FORCE_PARALLEL_REGRESS is for yet.
>
> The GUC is documented just like all the other GUCs are documented.
> Maybe that's not enough, but I don't think "no explanation of what
> they do" is accurate.  But I don't see why this patch should need to
> care about force_parallel_mode at all.  force_parallel_mode is about
> making queries that wouldn't choose to run in parallel do on their own
> do so anyway, whereas this patch is about making more queries able to
> do more work in parallel.

Hmm, it appears I only looked as far as the enum declaration, which I
expected to have something. Perhaps I'm just not used to looking up
the manual for things relating to code.

The one reason that I asked about force_parallel_mode was that I
assumed there was some buildfarm member running somewhere that
switches this on and runs the regression tests. I figured that if it
exists for other parallel features, then it probably should for this
too. Can you explain why you think this should be handled differently?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Identifying a message in emit_log_hook.

2016-03-10 Thread David Steele

Hi Simon,

On 3/10/16 7:26 AM, Simon Riggs wrote:


Can you add this to the CF? It was submitted before deadline.
I presume you have access to do that?


No problem - here it is:

https://commitfest.postgresql.org/9/576/

--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Client Log Output Filtering

2016-03-10 Thread David Steele

On 3/9/16 7:37 PM, Petr Jelinek wrote:

On 03/02/16 05:02, Robert Haas wrote:

On Mon, Feb 1, 2016 at 7:24 PM, David Steele  wrote:


I have attached a patch that adds an ereport() macro to suppress client
output for a single report call (applies cleanly on 1d0c3b3).  I'll also
move it to the next CF.


I don't see any reason not to accept this.



Yes, the idea seems sane.

Looking at the code, this adds bool hide_from_client to edata which is
not initialized in errstart so that needs to be fixed.


I figured this would take care of it:

MemSet(edata, 0, sizeof(ErrorData));

Since I want hide_from_client to default to false.  Am I missing something?

--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-10 Thread David Rowley
On 8 March 2016 at 11:15, David Rowley  wrote:
> The setrefs.c parts of the patch remain completely broken. I've not
> had time to look at this again yet, sorry.

Ok, so again, apologies for previously sending such a broken patch.
I've since managed to free up a bit of time to work on this, which now
consists of a good bit more than the sum total of my weekly lunch
hours.

The attached patch is a much more complete patch, and quite possible
now review worthy.

I set about solving the setrefs.c problem by inventing a PartialAggref
node type which wraps up Aggrefs when they're in a agg node which has
finalizeAggs = false. This node type exists all the way until executor
init time, where it's then removed and replaced with the underlying
Aggref. This seems to solve the targetlist return type issue.
I'd really like some feedback on this method of solving that problem.

I've also fixed numerous other bugs, including the HAVING clause problem.

Things left to do:
1. Make a final pass of the patch not at 3am.
2. Write some tests.
3. I've probably missed a few places that should handle T_PartialAggref.

A couple of things which I'm not 100% happy with.

1. make_partialgroup_input_target() doing lookups to the syscache.
Perhaps this job can be offloaded to a new function in a more suitable
location. Ideally the Aggref would already store the required
information, but I don't see a great place to be looking that up.
2. I don't really like how I had to add tlist to
create_grouping_paths(), but I didn't want to go to the trouble of
calculating the partial agg PathTarget if Parallel Aggregation is not
possible, as this does do syscache lookups, so it's not free, so I'd
rather only do it if we're actually going to add some parallel paths.
3. Something about the force_parallel_mode GUC. I'll think about this
when I start to think about how to test this, as likely I'll need
that, else I'd have to create tables bigger than we'd want to in the
regression tests.

I've also attached an .sql file with an aggregate function aimed to
test the new PartialAggref stuff works properly, as previously it
seemed to work by accident with sum(int).

Just some numbers to maybe make this more interesting:

create table t (num int not null, one int not null, ten int not null,
hundred int not null, thousand int not null, tenk int not null,
hundredk int not null, million int not null);
insert into t select
x,1,x%10+1,x%100+1,x%1000+1,x%1+1,x%10+1,x%100 from
generate_series(1,1000)x(x);

-- Serial Plan
# explain select sum(num) from t;
QUERY PLAN
---
 Aggregate  (cost=198530.00..198530.01 rows=1 width=8)
   ->  Seq Scan on t  (cost=0.00..173530.00 rows=1000 width=4)

# select sum(num) from t;
  sum

 500500
(1 row)
Time: 1036.119 ms

# set max_parallel_degree=4;

-- Parallel Plan
# explain select sum(num) from t;
  QUERY PLAN
--
 Finalize Aggregate  (cost=105780.52..105780.53 rows=1 width=8)
   ->  Gather  (cost=105780.00..105780.51 rows=5 width=8)
 Number of Workers: 4
 ->  Partial Aggregate  (cost=104780.00..104780.01 rows=1 width=8)
   ->  Parallel Seq Scan on t  (cost=0.00..98530.00
rows=250 width=4)
(5 rows)

# select sum(num) from t;
  sum

 500500
(1 row)

Time: 379.117 ms

I'll try and throw a bit parallel aggregate work to a 4 socket / 64
core server which I have access to... just for fun.

Reviews are now welcome.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


parallel_aggregation_cc3763e_2016-03-11.patch
Description: Binary data


parallel_aggregation_test.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Client Log Output Filtering

2016-03-11 Thread David Steele
On 3/10/16 11:00 AM, Petr Jelinek wrote:

> The comment above errhidefromclient says "Only log levels below ERROR
> can be hidden from the client." but use of the errhidefromclient(true)
> actually does hide the error message from client, client just gets
> failed query without any message when used with ERROR level.

Right you are.  The v3 patch adds this check.

I also added the documentation to sources.sgml that Tom pointed out was
missing.

Thanks,
-- 
-David
da...@pgmasters.net
diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml
index fcb3e40..4ad15d0 100644
--- a/doc/src/sgml/sources.sgml
+++ b/doc/src/sgml/sources.sgml
@@ -353,6 +353,14 @@ ereport(ERROR,
  includes the current statement already.
 

+   
+
+ errhidefromclient(bool hide_from_client) can be
+ called to suppress message output to the client when the error severity
+ level is lower than ERROR.  It is useful for hiding sensitive
+ messages from the client while still logging them on the server.
+
+   
   

 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 5b7554b..5e5035d 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1094,6 +1094,22 @@ errhidecontext(bool hide_ctx)
return 0;   /* return value does 
not matter */
 }
 
+/*
+ * errhidefromclient --- optionally suppress output of message to client
+ * if error severity level < ERROR.
+ */
+int
+errhidefromclient(bool hide_from_client)
+{
+   ErrorData  *edata = &errordata[errordata_stack_depth];
+
+   /* we don't bother incrementing recursion_depth */
+   CHECK_STACK_DEPTH();
+
+   edata->hide_from_client = hide_from_client && edata->elevel < ERROR;
+
+   return 0;   /* return value does 
not matter */
+}
 
 /*
  * errfunction --- add reporting function name to the current error
@@ -1477,7 +1493,7 @@ EmitErrorReport(void)
send_message_to_server_log(edata);
 
/* Send to client, if enabled */
-   if (edata->output_to_client)
+   if (edata->output_to_client && !edata->hide_from_client)
send_message_to_frontend(edata);
 
MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 7d338dd..05dfe2b 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -179,6 +179,7 @@ extern int  errcontext_msg(const char *fmt,...) 
pg_attribute_printf(1, 2);
 
 extern int errhidestmt(bool hide_stmt);
 extern int errhidecontext(bool hide_ctx);
+extern int errhidefromclient(bool hide_from_client);
 
 extern int errfunction(const char *funcname);
 extern int errposition(int cursorpos);
@@ -343,6 +344,7 @@ typedef struct ErrorData
boolshow_funcname;  /* true to force funcname inclusion */
boolhide_stmt;  /* true to prevent STATEMENT: 
inclusion */
boolhide_ctx;   /* true to prevent CONTEXT: 
inclusion */
+   boolhide_from_client;   /* true to prevent client 
output */
const char *filename;   /* __FILE__ of ereport() call */
int lineno; /* __LINE__ of 
ereport() call */
const char *funcname;   /* __func__ of ereport() call */


signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] [PROPOSAL] Client Log Output Filtering

2016-03-11 Thread David Steele
On 3/10/16 11:07 AM, Tom Lane wrote:
> Petr Jelinek  writes:
>> The comment above errhidefromclient says "Only log levels below ERROR 
>> can be hidden from the client." but use of the errhidefromclient(true) 
>> actually does hide the error message from client, client just gets 
>> failed query without any message when used with ERROR level.
> 
> Um.  That seems pretty broken --- I think it's a violation of the wire
> protocol spec.
> 
> I notice though that we allow client_min_messages to be set to FATAL,
> which would be a different way of violating the protocol.  Maybe we
> should reduce the max setting of that to ERROR?

This was the same conclusion I came to for the log_level setting in pgaudit.

I'll submit a proposal to hackers after 9.6 to make this change.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] [PROPOSAL] Client Log Output Filtering

2016-03-11 Thread David Steele
On 3/10/16 9:51 AM, Tom Lane wrote:

> The patch is evidently modeled on errhidestmt and errhidectx, which are
> making the same assumption for their fields.
> 
> I wonder whether, instead of continuing to proliferate random bool fields
> in struct ErrorData, we oughta replace them all with an "int flags" field.
> That's probably material for a separate patch though.

There are currently six boolean flags (if you include my new one) that
all relate to visibility in one way or another.  Combining them into a
"flags" field makes sense to me.

I'll submit a proposal to hackers after 9.6 to make this change.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Proposal: BSD Authentication support

2016-03-11 Thread David Steele
On 1/14/16 11:22 PM, Robert Haas wrote:
> On Tue, Jan 12, 2016 at 2:27 AM, Marisa Emerson  wrote:
>> I've attached the latest version of this patch. I've fixed up an issue with
>> the configuration scripts that I missed.
> Looks reasonable on a quick read-through.  Can anyone with access to a
> BSD system review and test?

Is anyone with access to/experience with BSD able to review and test
this patch?  Seems like it would make a great addition to 9.6.

Thanks,

-- 
-David
da...@pgmasters.net




signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Inconsistent error handling in START_REPLICATION command

2016-03-11 Thread David Steele
On 1/21/16 9:53 AM, Shulgin, Oleksandr wrote:
> On Thu, Jan 21, 2016 at 3:25 PM, Robert Haas  <mailto:robertmh...@gmail.com>> wrote:
>
>
> So it's true that the client can't unilaterally terminate COPY BOTH
> mode; it can only send CopyDone.  But an error on the server side
> should do so.
>
>
> Hm, you're right.  Even though the server sends COPY_BOTH message
> before the plugin startup, an attempt by the client to actually read
> the data messages results in ErrorResponse handled on the client, then
> the client can re-submit the corrected START_REPLICATION command and
> continue without the need to reconnect.  This cannot be actually
> tested in psql, but I can verify the behavior with my python scripts.
>
> Then I don't have a preference for the early error reporting in this
> case.  If the current behavior potentially allows for more flexible
> error reporting, I'm for keeping it.

It looks like a decision needs to be made here whether to apply this
patch, send it back to the author, or reject it so I'm marking it "Ready
for Committer".

Robert, since you were participating in this conversation can you have a
look?

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Combining Aggregates

2016-03-11 Thread David Steele
On 1/20/16 11:42 PM, David Rowley wrote:
> On 21 January 2016 at 08:06, Robert Haas  <mailto:robertmh...@gmail.com>> wrote:
> 
> On Wed, Jan 20, 2016 at 7:38 AM, David Rowley
> mailto:david.row...@2ndquadrant.com>>
> wrote:
> > Agreed. So I've attached a version of the patch which does not have any 
> of
> > the serialise/deserialise stuff in it.
> 
> I re-reviewed this and have committed most of it with only minor
> kibitizing.  A few notes:
> 
> 
> I've attached the re-based remainder, which includes the serial/deserial
> again.
> 
> I'll submit this part to March 'fest, where hopefully we'll also have
> something to utilise it.

As far as I can tell this is the most recent version of the patch but it
doesn't apply on master.  I'm guessing that's mostly because of Tom's
UPP patch.

This is a long and confusing thread and I should know since I just read
through the entire thing.  I'm setting this back to "waiting for author"
and I'd like see a rebased patch and a summary of what's in the patch
before setting it back to "needs review".

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] eXtensible Transaction Manager API (v2)

2016-03-11 Thread David Steele
On 2/10/16 12:50 PM, Konstantin Knizhnik wrote:

> PostgresProffesional cluster teams wants to propose new version of
> eXtensible Transaction Manager API.
> Previous discussion concerning this patch can be found here:
> 
> http://www.postgresql.org/message-id/f2766b97-555d-424f-b29f-e0ca0f6d1...@postgrespro.ru

I see a lot of discussion on this thread but little in the way of consensus.

> The API patch itself is small enough, but we think that it will be
> strange to provide just API without examples of its usage.

It's not all that small, though it does apply cleanly even after a few
months.  At least that indicates there is not a lot of churn in this area.

I'm concerned about the lack of response or reviewers for this patch.
It may be because everyone believes they had their say on the original
thread, or because it seems like a big change to go into the last CF, or
for other reasons altogether.

I think you should try to make it clear why this patch would be a win
for 9.6.

Is anyone willing to volunteer a review or make an argument for the
importance of this patch?

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Inconsistent error handling in START_REPLICATION command

2016-03-11 Thread David Steele
On 3/11/16 1:11 PM, Tom Lane wrote:
> Robert Haas  writes:
>> On Fri, Mar 11, 2016 at 11:36 AM, David Steele  wrote:
>>> It looks like a decision needs to be made here whether to apply this patch,
>>> send it back to the author, or reject it so I'm marking it "Ready for
>>> Committer".
>>>
>>> Robert, since you were participating in this conversation can you have a
>>> look?
> 
>> Who, me?  *looks around*
>>
>> OK, I had a look.  I don't think this is a bug.
> 
> I agree, and Craig's complaint that this change reduces flexibility
> for plugins seems to tilt the balance against it.  I see that Alex
> himself accepted that argument in his last message in the thread
> ().
>
> So let's mark it rejected and move on.

+1 and done.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] remove wal_level archive

2016-03-11 Thread David Steele
On 2/8/16 2:34 PM, Alvaro Herrera wrote:
> Peter Eisentraut wrote:
>> On 1/26/16 10:56 AM, Simon Riggs wrote:
>>> Removing one of "archive" or "hot standby" will just cause confusion and
>>> breakage, so neither is a good choice for removal.
>>>
>>> What we should do is 
>>> 1. Map "archive" and "hot_standby" to one level with a new name that
>>> indicates that it can be used for both/either backup or replication.
>>>   (My suggested name for the new level is "replica"...)
>>> 2. Deprecate "archive" and "hot_standby" so that those will be removed
>>> in a later release.
>>
>> Updated patch to reflect these suggestions.
> 
> I wonder if the "keep one / keep both" argument is running in circles as
> new reviewers arrive at the thread.  Perhaps somebody could read the
> whole thread(s) and figure out a way to find consensus so that we move
> forward on this.

There was a lot of argument upstream about whether to keep 'hot_standby'
or 'archive' but after the proposal to change it to 'replica' came up
everybody seemed to fall in line with that.

+1 from me for using 'replica' as the WAL level to replace 'hot_standby'
and 'archive'.

+1 from me for removing the 'hot_standby' and 'archive' options entirely
in 9.6 rather than deprecating.

Unless anyone has objections I would like to mark this 'ready for
committer'.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] eXtensible Transaction Manager API (v2)

2016-03-11 Thread David Steele
On 3/11/16 1:30 PM, Robert Haas wrote:

> There's been a lot of discussion on another thread about this patch.
> The subject is "The plan for FDW-based sharding", but the thread kind
> of got partially hijacked by this issue.  The net-net of that is that
> I don't think we have a clear enough idea about where we're going with
> global transaction management to make it a good idea to adopt an API
> like this.  For example, if we later decide we want to put the
> functionality in core, will we keep the hooks around for the sake of
> alternative non-core implementations?  I just don't believe this
> technology is nearly mature enough to commit to at this point.

Ah yes, I forgot about the related discussion on that thread.  Pasting
here for reference:

http://www.postgresql.org/message-id/20160223164335.ga11...@momjian.us

> Konstantin does not agree with my assessment, perhaps unsurprisingly.

I'm certainly no stranger to feeling strongly about a patch!

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] proposal: make NOTIFY list de-duplication optional

2016-03-11 Thread David Steele
Hi Filip,

On 2/20/16 8:00 AM, Filip Rembiałkowski wrote:
> On Fri, Feb 19, 2016 at 10:09 PM, Catalin Iacob  On 2/9/16, Tom Lane mailto:t...@sss.pgh.pa.us>>
> wrote:
> > FWIW, I think it would be a good thing if the NOTIFY statement syntax 
> were
> > not remarkably different from the syntax used in the pg_notify() 
> function
> > call.  To do otherwise would certainly be confusing.  So on the whole
> > I'd go with the "NOTIFY channel [ , payload [ , mode ] ]" option.
> 
> Filip, do you agree with Tom's proposal? Do you plan to rework the
> patch on these lines? If you are I'll try to review it, if not I could
> give it a shot as I'm interested in having this in 9.6.
> 
> I see that Tom's remarks give more flexibility, and your refinement
> makes sense.

It looks like we are waiting on a new patch from you before this can be
reviewed.  Are you close to having that done?

Meanwhile, I have marked it "Waiting on Author".

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] eXtensible Transaction Manager API (v2)

2016-03-11 Thread David Steele
On 3/11/16 2:00 PM, Oleg Bartunov wrote:
> On Fri, Mar 11, 2016 at 7:11 PM, David Steele  I'm concerned about the lack of response or reviewers for this patch.
> It may be because everyone believes they had their say on the original
> thread, or because it seems like a big change to go into the last CF, or
> for other reasons altogether.
> 
> 
> We'll prepare easy setup to play with our solutions, so any developers
> could see how it works.  Hope this weekend we'll post something about this.

OK, then for now I'm marking this "waiting for author."  You can switch
it back to "needs review" once you have posted additional material.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] insufficient qualification of some objects in dump files

2016-03-11 Thread David Steele
Hi Peter,

On 2/26/16 1:30 AM, Tom Lane wrote:
> Michael Paquier  writes:
>> On Fri, Feb 26, 2016 at 9:47 AM, Peter Eisentraut  wrote:
>>> Tom thought this might require an archive version dump, but I'm not
>>> sure.  The tags are more of an informational string for human
>>> consumption, not strictly part of the archive format.
> 
>> Hm, the TOC entry, with its tag changed, is part of the dump, and this
>> is written in the archive, but the shape of TocEntry does not change
>> so this is really debatable.
> 
> I had in mind that we would add a separate field for tag's schema name to
> TocEntry, which surely would require an archive format number bump.
> As the patch is presented, I agree with Peter that it does not really
> need a format number bump.  The question that has to be answered is
> whether this solution is good enough?  You could not trust it for
> automated processing of tags --- it's easy to think of cases in which the
> schema/object name separation would be ambiguous.  So is the tag really
> "strictly for human consumption"?  I'm not sure about that.

It looks like there is still some discussion to be had here about
whether a "human readable" solution is enough.

Until that's resolved I've marked this patch "Waiting on Author".

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] 2016-03 Commitfest

2016-03-11 Thread David Steele
We're are now one third of the way through the 2016-03 Commitfest.

There are still some patches left that need review but have no reviewer
(https://commitfest.postgresql.org/9/?reviewer=-2) though a lot have
been picked up in the last week.

Needs review: 56
Needs *reviewer*: 15 (was 58 last week)

There were a number of patches that were marked as "needs review" but
were actually "waiting for author" as far as I could see.  If your patch
is in this state it would be good if you could give an idea when you
will be able to supply a new patch or otherwise address concerns raised
in the thread.

Waiting on Author: 29

The closed patches are up significantly since last week and the good
news is that most of them were committed.  36% of the patches are now
closed.

Committed: 46
Rejected: 4
Returned with Feedback: 4

I will continue to respond individually to threads that seem idle or
have issues that need to be resolved.  Please do not hesitate to contact
me about about any concerns that you might have regarding the Commitfest
process.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Parallel Aggregate

2016-03-11 Thread David Rowley
On 11 March 2016 at 03:39, David Rowley  wrote:
> A couple of things which I'm not 100% happy with.
>
> 1. make_partialgroup_input_target() doing lookups to the syscache.
> Perhaps this job can be offloaded to a new function in a more suitable
> location. Ideally the Aggref would already store the required
> information, but I don't see a great place to be looking that up.

I've made some changes and moved this work off to a new function in tlist.c.

> 2. I don't really like how I had to add tlist to
> create_grouping_paths(), but I didn't want to go to the trouble of
> calculating the partial agg PathTarget if Parallel Aggregation is not
> possible, as this does do syscache lookups, so it's not free, so I'd
> rather only do it if we're actually going to add some parallel paths.

This is now fixed. The solution was much easier after 49635d7b.

> 3. Something about the force_parallel_mode GUC. I'll think about this
> when I start to think about how to test this, as likely I'll need
> that, else I'd have to create tables bigger than we'd want to in the
> regression tests.

On further analysis it seems that this GUC does not do what I thought
it did, which will be why Robert said that I don't need to think about
it here. The GUC just seems to add a Gather node at the base of the
plan tree, when possible. Which leaves me a bit lost when it comes to
how to write tests for this... It seems like I need to add at least
136k rows to a test table to get a Parallel Aggregate plan, which I
think is a no-go for the regression test suite... that's with
parallel_setup_cost=0;

It would be nice if that GUC just, when enabled, preferred the
cheapest parallel plan (when available), rather than hacking in a
Gather node into the plan's root. This should have the same result in
many cases anyway, and would allow me to test this without generating
oversized tables in the regression tests.

I've attached an updated patch which is based on commit 7087166,
things are really changing fast in the grouping path area at the
moment, but hopefully the dust is starting to settle now.

This patch also fixes a couple of bugs, one in the cost estimates for
the number of groups that will be produced by the final aggregate,
also there was a missing copyObject() in the setrefs.c code which
caused a Var not found in targetlist problem in setrefs.c for plans
with more than 1 partial aggregate node... I had to modify the planner
to get it to add an additional aggregate node to test this (separate
test patch for this is attached).

Comments/Reviews/Testing all welcome.

Thanks

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


parallel_aggregation_8a26089_2016-03-12.patch
Description: Binary data


parallel_aggregation_extra_combine_node.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread David Rowley
On 12 March 2016 at 11:43, Tom Lane  wrote:
> I wrote:
>> I wondered why, instead of inventing an extra semantics-modifying flag,
>> we couldn't just change the jointype to *be* JOIN_SEMI when we've
>> discovered that the inner side is unique.
>
> BTW, to clarify: I'm not imagining that we'd make this change in the
> query jointree, as for example prepjointree.c might do.  That would appear
> to add join order constraints, which we don't want.  But I'm thinking that
> at the instant where we form a join Path, we could change the Path's
> jointype to be JOIN_SEMI or JOIN_SEMI_OUTER if we're able to prove the
> inner side unique, rather than annotating the Path with a separate flag.
> Then that representation is what propagates forward.

Thanks for looking at this.

Yes that might work, since we'd just be changing the jointype in the
JoinPath, if that path was discarded if favour of, say the commutative
variant, which was not "unique inner", then it shouldn't matter, as
the join type for that path would be the original one.

The thing that might matter is that, this;


explain (costs off) select * from t1 inner join t2 on t1.id=t2.id
 QUERY PLAN
--
 Hash Join
   Hash Cond: (t1.id = t2.id)
   ->  Seq Scan on t1
   ->  Hash
 ->  Seq Scan on t2

could become;

  QUERY PLAN
--
 Hash Semi Join
   Hash Cond: (t1.id = t2.id)
   ->  Seq Scan on t1
   ->  Hash
 ->  Seq Scan on t2

Wouldn't that cause quite a bit of confusion? People browsing EXPLAIN
output might be a bit confused at the lack of EXISTS/IN clause in a
query which is showing a Semi Join. Now, we could get around that by
adding JOIN_SEMI_INNER I guess, and just displaying that as a normal
inner join, yet it'll behave exactly like JOIN_SEMI!

And if we do that, then the join node code changes from;

if (node->js.match_first_tuple_only)
  node->nl_NeedNewOuter = true;

to;

if (node->js.jointype == JOIN_SEMI || node->js.jointype ==
JOIN_SEMI_INNER || node->js.jointype == JOIN_SEMI_LEFT)
  node->nl_NeedNewOuter = true;

which is kinda horrid and adds a few cycles and code without a real
need for it. If we kept the match_first_tuples_only and set it in the
node startup similar to how it is now, or changed JoinType to be some
bitmask mask that had a bit for each piece of a venn diagram, and an
extra for the unique inner... but I'm not so sure I like that idea...

To me it seems neater just to keep the match_first_tuple_only and just
update the planner stuff to use the new jointypes.

Thoughts?

>
> It seems like the major intellectual complexity here is to figure out
> how to detect inner-side-unique at reasonable cost.  I see that for
> LEFT joins you're caching that in the SpecialJoinInfos, which is probably
> fine.  But for INNER joins it looks like you're just doing it over again
> for every candidate join, and that seems mighty expensive.

hmm yeah, perhaps something can be done to cache that, I'm just not
all that sure how much reuse the cache will get used since it
generates the Paths for nestloop/merge/has join methods all at once,
once the unique_inner has been determined for that inner rel, and the
restrict info for whatever the outer rel(s) are. I didn't expect that
to happen twice, so I'm not all that sure if a cache will get
reused...

... thinks more ...

Perhaps maybe if rel x was found to be unique with the join conditions
of rel y, then we can be sure that x is unique against y, z, as that
could only possible add more to the join condition, never less. Is
this what you meant?

... I'll look into that.

The other thing I thought of was to add a dedicated list for unique
indexes in RelOptInfo, this would also allow
rel_supports_distinctness() to do something a bit smarter than just
return false if there's no indexes. That might not buy us much though,
but at least relations tend to have very little unique indexes, even
when they have lots of indexes.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-13 Thread David Rowley
On 12 March 2016 at 11:43, Tom Lane  wrote:
> I wrote:
>> I wondered why, instead of inventing an extra semantics-modifying flag,
>> we couldn't just change the jointype to *be* JOIN_SEMI when we've
>> discovered that the inner side is unique.
>
> BTW, to clarify: I'm not imagining that we'd make this change in the
> query jointree, as for example prepjointree.c might do.  That would appear
> to add join order constraints, which we don't want.  But I'm thinking that
> at the instant where we form a join Path, we could change the Path's
> jointype to be JOIN_SEMI or JOIN_SEMI_OUTER if we're able to prove the
> inner side unique, rather than annotating the Path with a separate flag.
> Then that representation is what propagates forward.

I've attached a patch which implements this, although I did call the
new join type JOIN_LEFT_UNIQUE rather than JOIN_SEMI_OUTER.
I'm not all that confident that I've not added handling for
JOIN_LEFT_UNIQUE in places where it's not possible to get that join
type, I did leave out a good number of places where I know it's not
possible. I'm also not sure with too much certainty that I've got all
cases correct, but the regression tests pass. The patch is more
intended for assisting discussion than as a ready to commit patch.

> It seems like the major intellectual complexity here is to figure out
> how to detect inner-side-unique at reasonable cost.  I see that for
> LEFT joins you're caching that in the SpecialJoinInfos, which is probably
> fine.  But for INNER joins it looks like you're just doing it over again
> for every candidate join, and that seems mighty expensive.

I have a rough idea for this, but I need to think of it a bit more to
make sure it's bulletproof.
I also just noticed (since it's been a while since I wrote the patch)
that in add_paths_to_joinrel() that innerrel can naturally be a
joinrel too, and we can fail to find uniqueness in that joinrel. I
think it should be possible to analyse the join rel too and search for
a base rel which supports the distinctness, and then ensure none of
the other rels which make up the join rel can cause tuple duplication
of that rel. But this just causes missed optimisation opportunities.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


unique_joins_82d2a07_2016-03-14.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 12 March 2016 at 16:31, David Rowley  wrote:
> I've attached an updated patch which is based on commit 7087166,
> things are really changing fast in the grouping path area at the
> moment, but hopefully the dust is starting to settle now.

The attached patch fixes a harmless compiler warning about a possible
uninitialised variable.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


parallel_aggregation_015971a_2016-03-14.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 14:16, James Sewell  wrote:

> I've done some testing with one of my data sets in an 8VPU virtual
> environment and this is looking really, really good.
>
> My test query is:
>
> SELECT pageview, sum(pageview_count)
> FROM fact_agg_2015_12
> GROUP BY date_trunc('DAY'::text, pageview);
>
> The query returns 15 rows. The fact_agg table is 5398MB and holds around
> 25 million records.
>
> Explain with a max_parallel_degree of 8 tells me that the query will only
> use 6 background workers. I have no indexes on the table currently.
>
> Finalize HashAggregate  (cost=810142.42..810882.62 rows=59216 width=16)
>Group Key: (date_trunc('DAY'::text, pageview))
>->  Gather  (cost=765878.46..808069.86 rows=414512 width=16)
>  Number of Workers: 6
>  ->  Partial HashAggregate  (cost=764878.46..765618.66 rows=59216
> width=16)
>Group Key: date_trunc('DAY'::text, pageview)
>->  Parallel Seq Scan on fact_agg_2015_12
>  (cost=0.00..743769.76 rows=4221741 width=12)
>

Great! Thanks for testing this.

If you run EXPLAIN ANALYZE on this with the 6 workers, does the actual
number of Gather rows come out at 105? I'd just like to get an idea of my
cost estimate for the Gather are going to be accurate for real world data
sets.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 14:52, James Sewell  wrote:
> One question - how is the upper limit of workers chosen?

See create_parallel_paths() in allpaths.c. Basically the bigger the
relation (in pages) the more workers will be allocated, up until
max_parallel_degree.

There is also a comment in that function which states:
/*
* Limit the degree of parallelism logarithmically based on the size of the
* relation.  This probably needs to be a good deal more sophisticated, but we
* need something here for now.
*/

So this will likely see some revision at some point, after 9.6.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 16:39, James Sewell  wrote:
>
> I've been testing how this works with partitioning (which seems to be 
> strange, but I'll post separately about that) and something odd seems to be 
> going on now with the parallel triggering:
>
> postgres=# create table a as select * from base_p2015_11;
> SELECT 2000
>
> postgres=# explain  select sum(count) from a group by date_trunc('DAY',ts);
>   QUERY PLAN
> --
>  Finalize GroupAggregate  (cost=218242.96..218254.46 rows=200 width=16)
>Group Key: (date_trunc('DAY'::text, ts))
>->  Sort  (cost=218242.96..218245.96 rows=1200 width=16)
>  Sort Key: (date_trunc('DAY'::text, ts))
>  ->  Gather  (cost=218059.08..218181.58 rows=1200 width=16)
>Number of Workers: 5
>->  Partial HashAggregate  (cost=217059.08..217061.58 rows=200 
> width=16)
>  Group Key: date_trunc('DAY'::text, ts)
>  ->  Parallel Seq Scan on a  (cost=0.00..197059.06 
> rows=405 width=12)
> (9 rows)
>
> postgres=# analyze a;
>
> postgres=# explain  select sum(count) from a group by date_trunc('DAY',ts);
> QUERY PLAN
> --
>  GroupAggregate  (cost=3164211.55..3564212.03 rows=2024 width=16)
>Group Key: (date_trunc('DAY'::text, ts))
>->  Sort  (cost=3164211.55..3214211.61 rows=2024 width=12)
>  Sort Key: (date_trunc('DAY'::text, ts))
>  ->  Seq Scan on a  (cost=0.00..397059.30 rows=2024 width=12)
> (5 rows)
>
> Unsure what's happening here.

This just comes down to the fact that PostgreSQL is quite poor at
estimating the number of groups that will be produced by the
expression date_trunc('DAY',ts). Due to lack of stats when you run the
query before ANALYZE, PostgreSQL just uses a hardcoded guess of 200,
which it thinks will fit quite nicely in the HashAggregate node's hash
table. After you run ANALYZE this estimate goes up to 2024, and
the grouping planner thinks that's a little to much be storing in a
hash table, based on the size of your your work_mem setting, so it
uses a Sort plan instead.

Things to try:
1. alter table a add column ts_date date; update a set ts_date =
date_trunc('DAY',ts); vacuum full analyze ts;
2. or, create index on a (date_trunc('DAY',ts)); analyze a;
3. or for testing, set the work_mem higher.

So, basically, it's no fault of this patch. It's just there's no real
good way for the planner to go estimating something like
date_trunc('DAY',ts) without either adding a column which explicitly
stores that value (1), or collecting stats on the expression (2), or
teaching the planner about the internals of that function, which is
likely just never going to happen. (3) is just going to make the
outlook of a hash plan look a little brighter, although you'll likely
need a work_mem of over 1GB to make the plan change.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-13 Thread David Rowley
On 14 March 2016 at 17:05, James Sewell  wrote:
>
> Hi again,
>
> I've been playing around with inheritance combined with this patch. Currently 
> it looks like you are taking max(parallel_degree) from all the child tables 
> and using that for the number of workers.
>
> For large machines it makes much more sense to use sum(parallel_degree) - but 
> I've just seen this comment in the code:
>
> /*
>  * Decide what parallel degree to request for this append path.  For
>  * now, we just use the maximum parallel degree of any member.  It
>  * might be useful to use a higher number if the Append node were
>  * smart enough to spread out the workers, but it currently isn't.
>  */
>
> Does this mean that even though we are aggregating in parallel, we are only 
> operating on one child table at a time currently?

There is nothing in the planner yet, or any patch that I know of to
push the Partial Aggregate node to below an Append node. That will
most likely come in 9.7.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] remove wal_level archive

2016-03-14 Thread David Steele

On 3/11/16 1:29 PM, David Steele wrote:


Unless anyone has objections I would like to mark this 'ready for
committer'.


This patch is now ready for committer.

--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Batch update of indexes

2016-03-14 Thread David Steele

Hi Konstantin,

On 2/3/16 11:47 AM, Konstantin Knizhnik wrote:

Attached please find patch for "ALTER INDEX ... WHERE ..." clause.
It is now able to handle all three possible situations:
1. Making index partial (add WHERE condition to the ordinary index)
2. Extend partial index range (less restricted index predicate)
3. Arbitrary change of partial index predicate

In case 2) new records are added to the index.
In other two cases index is completely reconstructed.

This patch includes src/bin/insbench utility for testing insert
performance. It can be easily excluded from the patch to reduce it size.
Also it is better to apply this patch together with "index-only scans
with partial indexes" patch:


This patch no longer applies on master:

$ git apply ../other/alter-index.patch
../other/alter-index.patch:27: trailing whitespace.
static void
../other/alter-index.patch:62: indent with spaces.
SPIPlanPtr plan;
../other/alter-index.patch:63: indent with spaces.
Portal portal;
../other/alter-index.patch:90: trailing whitespace.

../other/alter-index.patch:99: trailing whitespace.

error: patch failed: src/test/regress/expected/aggregates.out:831
error: src/test/regress/expected/aggregates.out: patch does not apply

Please provide a new patch for review.  Meanwhile I am marking this 
"waiting on author".


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash-continuations in custom scripts

2016-03-14 Thread David Steele

On 2/18/16 6:54 AM, Kyotaro HORIGUCHI wrote:


First, I rebased the previous patch set and merged three of
them. Now they are of three patches.


1. Making SQL parser part of psqlscan independent from psql.

Moved psql's baskslsh command stuff out of original psqlscan.l
and some psql stuff the parser directly looked are used via a
set of callback functions, which can be all NULL for usages
from other than psql.

2. Making pgbench to use the new psqlscan parser.

3. Changing the way to hold SQL/META commands from array to
 linked list.

 The #2 introduced linked list to store SQL multistatement but
 immediately the caller moves the elements into an array. This
 patch totally changes the way to linked list.


Any takers to review this updated patch?

--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-03-14 Thread David Steele

On 2/18/16 10:10 AM, Constantin S. Pan wrote:

On Wed, 17 Feb 2016 23:01:47 +0300
Oleg Bartunov  wrote:


My feedback is (Mac OS X 10.11.3)

set gin_parallel_workers=2;
create index message_body_idx on messages using gin(body_tsvector);
LOG:  worker process: parallel worker for PID 5689 (PID 6906) was
terminated by signal 11: Segmentation fault


Fixed this, try the new patch. The bug was in incorrect handling
of some GIN categories.


Oleg, it looks like Constantin has updated to patch to address the issue 
you were seeing.  Do you have time to retest and review?


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2016-03-14 Thread David Steele

Hi Anastasia,

On 2/18/16 12:29 PM, Anastasia Lubennikova wrote:

18.02.2016 20:18, Anastasia Lubennikova:

04.02.2016 20:16, Peter Geoghegan:

On Fri, Jan 29, 2016 at 8:50 AM, Anastasia Lubennikova
  wrote:

I fixed it in the new version (attached).


Thank you for the review.
At last, there is a new patch version 3.0. After some refactoring it
looks much better.
I described all details of the compression in this document
https://goo.gl/50O8Q0 (the same text without pictures is attached in
btc_readme_1.0.txt).
Consider it as a rough copy of readme. It contains some notes about
tricky moments of implementation and questions about future work.
Please don't hesitate to comment it.


Sorry, previous patch was dirty. Hotfix is attached.


This looks like an extremely valuable optimization for btree indexes but 
unfortunately it is not getting a lot of attention.  It still applies 
cleanly for anyone interested in reviewing.


It's not clear to me that you answered all of Peter's questions in [1]. 
 I understand that you've provided a README but it may not be clear if 
the answers are in there (and where).


Also, at the end of the README it says:

13. Xlog. TODO.

Does that mean the patch is not yet complete?

Thanks,
--
-David
da...@pgmasters.net

[1] 
http://www.postgresql.org/message-id/cam3swzq3_plqch4w7uq8q_f2t4hesektr2n0rq5pxa18oer...@mail.gmail.com



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: use foreign keys to improve join estimates v1

2016-03-14 Thread David Steele

Hi Thomas,

On 2/24/16 11:21 AM, Tomas Vondra wrote:


Overall, I still believe the FK patch is a clear improvement of the
current status - while it's true it does not improve all possible cases
and there's a room for additional improvements (like handling multiple
candidate FK constraints), it should not make existing estimates worse.


The latest version of this patch does not apply:

$ git apply ../other/0001-estimation-with-fkeys-v2.patch
../other/0001-estimation-with-fkeys-v2.patch:748: trailing whitespace.

error: patch failed: src/backend/optimizer/util/plancat.c:27
error: src/backend/optimizer/util/plancat.c: patch does not apply
error: patch failed: src/include/nodes/relation.h:468
error: src/include/nodes/relation.h: patch does not apply

David's most recent version also does not apply:

$ git apply ../other/estimation-with-fkeys-v2_davidv4.patch
../other/estimation-with-fkeys-v2_davidv4.patch:517: trailing whitespace.

error: patch failed: src/backend/optimizer/util/plancat.c:27
error: src/backend/optimizer/util/plancat.c: patch does not apply
error: patch failed: src/include/nodes/relation.h:472
error: src/include/nodes/relation.h: patch does not apply

I don't think it would be clear to any reviewer which patch to apply 
even if they were working.  I'm marking this "waiting for author".


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check

2016-03-14 Thread David Steele

On 2/25/16 4:44 PM, Vitaly Burovoy wrote:


Added to the commitfest 2016-03.

[CF] https://commitfest.postgresql.org/9/540/


This looks like a fairly straight-forward bug fix (the size of the patch 
is deceptive because there a lot of new tests included).  It applies 
cleanly.


Anastasia, I see you have signed up to review.  Do you have an idea when 
you will get the chance to do that?


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Batch update of indexes

2016-03-14 Thread David Steele

On 3/14/16 10:28 AM, Konstantin Knizhnik wrote:


Rebased patch is attached.


Thanks for the quick turnaround!

Marko, you are signed up to review this patch.  Do you have an idea of 
when you'll be able to do that?


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

2016-03-14 Thread David Steele

On 2/23/16 2:17 AM, Michael Paquier wrote:


As a continuation of the thread firstly dedicated to SCRAM:
http://www.postgresql.org/message-id/55192afe.6080...@iki.fi
Here is a new thread aimed at gathering all the ideas of this previous
thread and aimed at clarifying a bit what has been discussed until now
regarding password protocols, verifiers, and SCRAM itself.


It looks like this patch set is a bit out of date.

When applying 0004:

$ git apply 
../other/0004-Remove-password-verifiers-for-unsupported-protocols-.patch

error: patch failed: src/bin/pg_upgrade/pg_upgrade.c:262
error: src/bin/pg_upgrade/pg_upgrade.c: patch does not apply

Then I tried to build with just 0001-0003:

cd /postgres/src/include/catalog && '/usr/bin/perl' ./duplicate_oids
3318
3319
3320
3321
3322
make[3]: *** [postgres.bki] Error 1

Could you provide an updated set of patches for review?  Meanwhile I am 
marking this as "waiting for author".


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-14 Thread David Steele

Hi Jeff,

On 2/25/16 5:00 PM, Jeff Janes wrote:


But, It doesn't sound like I am going to win that debate.  Given that,
I don't think we need a different name for the function. I'm fine with
explaining the word-boundary subtlety in the documentation, and
keeping the function name itself simple.


It's not clear to me if you are requesting more documentation here or 
stating that you are happy with it as-is.  Care to elaborate?


Other than that I think this patch looks to be ready for committer. Any 
objections?


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

2016-03-14 Thread David Steele

On 2/24/16 12:40 AM, Michael Paquier wrote:


This has the merit to be clear, thanks for the input. Whatever the
approach taken at the end we have two candidates:
- Extend XLogInsert() with an extra argument for flags (Andres)
- Introduce XLogInsertExtended with this extra argument and let
XLogInsert() in peace (Robert and I).
Actually, I lied, there was still something I could do for this
thread: attached are two patches implementing both approaches as
respectively a-1 and a-2. Patch b is the set of logs I used for the
tests to show with a low checkpoint_timeout that checkpoints are
getting correctly skipped on an idle system.


Unfortunately neither A nor B apply anymore.

However, since the patches can still be read through I wonder if Robert 
or Andres would care to opine on whether A or B is better now that they 
can see the full implementation?


For my 2c I'm happy with XLogInsertExtended() since it seems to be a 
rare use case where flags are required.  This can always be refactored 
in the future if/when the use of flags spreads.


I think it would be good to make a decision on this before asking 
Michael to rebase.


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index

2016-03-14 Thread David Steele

On 2/15/16 10:29 AM, Teodor Sigaev wrote:


It's very pity but author is not able to continue work on this patch,
and I would like to raise this flag.

I'd like to add some comments about patches:

traversalValue patch adds arbitrary value assoсiated with branch in
SP-GiST tree walk. Unlike to recostructedValue it could be just pointer,
it havn't to be a regular pgsql type. Also, it could be used independly
from reconstructedValue. This patch is used both following attached
patches.

range patch just switchs using reconstructedValue to traversalValue in
range opclasses. reconstructedValue was used just because it was an
acceptable workaround in case of range type. Range opclase stores a full
value in leafs and doesn't need to use reconstructedValue to return
tuple in index only scan.
See http://www.postgresql.org/message-id/5399.1343512...@sss.pgh.pa.us

q4d patch implements index over boxes using SP-GiST. Basic idea was an
observation, range opclass thinks about one-dimensional ranges as 2D
points.
Following this idea, we can think that 2D box (what is 2 points or 4
numbers) could represent a 4D point. We hoped that this approach will be
much more effective than traditional R-Tree in case of many overlaps in
box's collection.
Performance results are coming shortly.


It appears that the issues raised in this thread have been addressed but 
the patch still has not gone though a real review.


Anybody out there willing to take a crack at a review?  All three 
patches apply (with whitespace issues).


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dealing with extension dependencies that aren't quite 'e'

2016-03-14 Thread David Steele

Hi Abhijit,

On 3/1/16 8:36 AM, Jim Nasby wrote:

On 2/29/16 10:33 PM, Abhijit Menon-Sen wrote:

>Given the audience for this, I think it'd probably be OK to just
>provide a function that does this, instead of DDL.

That seems like a promising idea. Can you suggest some possible usage?


pg_extension_dependency( regextension, any )

where "any" would be all the other reg* types. That should be a lot less
work to code up than messing with the grammar.


So where are we on this now?  Were you going to implement this as a 
function the way Jim suggested?


Alexander, you are signed up to review.  Any opinion on which course is 
best?


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-03-14 Thread David Steele

On 2/26/16 11:37 PM, Amit Kapila wrote:


On Sat, Feb 27, 2016 at 10:03 AM, Amit Kapila 

This patch no longer applies cleanly:

$ git apply ../other/group_update_clog_v6.patch
error: patch failed: src/backend/storage/lmgr/proc.c:404
error: src/backend/storage/lmgr/proc.c: patch does not apply
error: patch failed: src/include/storage/proc.h:152
error: src/include/storage/proc.h: patch does not apply

It's not clear to me whether Robert has completed a review of this code 
or it still needs to be reviewed more comprehensively.


Other than a comment that needs to be fixed it seems that all questions 
have been answered by Amit.


Is this "ready for committer" or still in need of further review?

--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash-continuations in custom scripts

2016-03-14 Thread David Steele
Hi Fabien,

On 3/14/16 3:27 PM, Fabien COELHO wrote:

>> Any takers to review this updated patch?
> 
> I intend to have a look at it, I had a look at a previous instance, but
> I'm ok if someone wants to proceed.

There's not exactly a long line of reviewers at the moment so if you
could do a followup review that would be great.

Thanks,
-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] [PATCH v6] GSSAPI encryption support

2016-03-14 Thread David Steele
On 3/14/16 4:10 PM, Robbie Harwood wrote:
> David Steele  writes:
> 
>> Hi Robbie,
>>
>> On 3/8/16 5:44 PM, Robbie Harwood wrote:
>>> Hello friends,
>>>
>>> Here's yet another version of GSSAPI encryption support.  It's also
>>> available for viewing on my github:
>>
>> The build went fine but when testing I was unable to logon at all.  I'm
>> using the same methodology as in
>> http://www.postgresql.org/message-id/56be0ff9.70...@pgmasters.net except
>> that I'm running against 51c0f63 and using the v6 patch set.
>>
>> psql simply hangs and never returns.  I have attached a pcap of the
>> psql/postgres session generated with:
>>
>> tcpdump -i lo -nnvvXSs 1514 port 5432 -w gssapi.pcap
>>
>> If you would like me to capture more information please let me know
>> specifically how you would like me to capture it.
> 
> Please disregard my other email.  I think I've found the issue; will
> post a new version in a moment.

Strange timing since I was just testing this.  Here's what I got:

$ pg/bin/psql -h localhost -U vagr...@pgmasters.net postgres
conn->inStart = 179, conn->inEnd = 179, conn->inCursor = 179
psql (9.6devel)
Type "help" for help.

postgres=>

This was after commenting out:

// appendBinaryPQExpBuffer(&conn->gwritebuf,
//  conn->inBuffer + conn->inStart,
//  conn->inEnd - conn->inStart);
// conn->inEnd = conn->inStart;

The good news I can log on every time now!

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] [PATCH v6] GSSAPI encryption support

2016-03-14 Thread David Steele
On 3/14/16 7:20 PM, Robbie Harwood wrote:

> David Steele  writes:
>
>>
>> Strange timing since I was just testing this.  Here's what I got:
>>
>> $ pg/bin/psql -h localhost -U vagr...@pgmasters.net postgres
>> conn->inStart = 179, conn->inEnd = 179, conn->inCursor = 179
>> psql (9.6devel)
>> Type "help" for help.
>>
>> postgres=>
> 
> Thanks, that certainly is interesting!  I did finally manage to
> reproduce the issue on my end, but the rate of incidence is much lower
> than what you and Michael were seeing: I have to run connections in a
> loop for about 10-20 minutes before it makes itself apparent (and no,
> it's not due to entropy).  Apparently I just wasn't patient enough.

I'm running client and server on the same VM - perhaps that led to less
latency than your setup?

> All that is to say: thank you very much for investigating that!

My pleasure!

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread David Rowley
== true) when it's a boolean.
> Similarly !a rather than a == false.

hmm, thanks. It appears that I've not been all that consistent in that
area. I didn't know that was convention. I see that some of my way
have crept into the explain.c changes already :/

I will send an updated patch once I address Tomas' concerns too.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread David Rowley
On 15 March 2016 at 11:39, Tomas Vondra  wrote:
> I've looked at this patch today. The patch seems quite solid, but I do have
> a few minor comments (or perhaps questions, given that this is the first
> time I looked at the patch).
>
> 1) exprCollation contains this bit:
> ---
>
> case T_PartialAggref:
> coll = InvalidOid; /* XXX is this correct? */
> break;
>
> I doubt this is the right thing to do. Can we actually get to this piece of
> code? I haven't tried too hard, but regression tests don't seem to trigger
> this piece of code.

Thanks for looking at this.

Yeah, it's there because it it being called during setrefs.c in
makeVarFromTargetEntry() via fix_combine_agg_expr_mutator(), so it's
required when building the new target list for Aggref->args to point
to the underlying Aggref's Var.

As for the collation, I'm still not convinced if it's right or wrong.
I know offlist you mentioned about string_agg() and sorting, but
there' no code that'll sort the agg's state. The only possible thing
that gets sorted there is the group by key.

> Moreover, if we're handling PartialAggref in exprCollation(), maybe we
> should handle it also in exprInputCollation and exprSetCollation?

hmm, maybe, that I'm not sure about. I don't see where we'd call
exprSetCollation() for this, but I think I need to look at
exprInputCollation()

> And if we really need the collation there, why not to fetch the actual
> collation from the nested Aggref? Why should it be InvalidOid?

It seems quite random to me to do that. If the trans type is bytea,
why would it be useful to inherit the collation from the aggregate?
I'm not confident I'm right with InvalidOId... I just don't think we
can pretend the collation is the same as the Aggref's.


> 2) partial_aggregate_walker
> ---
>
> I think this should follow the naming convention that clearly identifies the
> purpose of the walker, not what kind of nodes it is supposed to walk. So it
> should be:
>
> aggregates_allow_partial_walker

Agreed and changed.

> 3) create_parallelagg_path
> --
>
> I do agree with the logic that under-estimates are more dangerous than
> over-estimates, so the current estimate is safer. But I think this would be
> a good place to apply the formula I proposed a few days ago (or rather the
> one Dean Rasheed proposed in response).
>
> That is, we do know that there are numGroups in total, and each parallel
> worker sees subpath->rows then it's expected to see
>
> sel = (subpath->rows / rel->tuples);
> perGroup = (rel->tuples / numGroups);
> workerGroups = numGroups * (1 - powl(1 - s, perGroup));
> numPartialGroups = numWorkers * workerGroups
>
> It's probably better to see Dean's message from 13/3.

I think what I have works well when there's a small number of groups,
as there's a good chance that each worker will see at least 1 tuple
from each group. However I understand that will become increasingly
unlikely with a larger number of groups, which is why I capped it to
the total input rows, but even in cases before that cap is reached I
think it will still overestimate. I'd need to analyze the code above
to understand it better, but my initial reaction is that, you're
probably right, but I don't think I want to inherit the fight for
this. Perhaps it's better to wait until GROUP BY estimate improvement
patch gets in, and change this, or if this gets in first, then you can
include this change in your patch. I'm not trying to brush off the
work, I just would rather it didn't delay parallel aggregate.

> 4) Is clauses.h the right place for PartialAggType?

I'm not sure that it is to be honest. I just put it there because the
patch never persisted the value of a PartialAggType in any struct
field anywhere and checks it later in some other file. In all places
where we use PartialAggType we're also calling
aggregates_allow_partial(), which does require clauses.h. So that's
why it ended up there... I think I'll leave it there until someone
gives me a good reason to move it.

An updated patch will follow soon.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-14 Thread David Rowley
On 15 March 2016 at 11:24, James Sewell  wrote:
> On Tuesday, 15 March 2016, Robert Haas  wrote:
>>
>> > Does the cost of the aggregate function come into this calculation at
>> > all? In PostGIS land, much smaller numbers of rows can generate loads
>> > that would be effective to parallelize (worker time much >> than
>> > startup cost).
>>
>> Unfortunately, no - only the table size.  This is a problem, and needs
>> to be fixed.  However, it's probably not going to get fixed for 9.6.
>> :-(
>
>
> Any chance of getting a GUC (say min_parallel_degree) added to allow setting
> the initial value of parallel_degree, then changing the small relation check
> to also pass if parallel_degree > 1?
>
> That way you could set min_parallel_degree on a query by query basis if you
> are running aggregates which you know will take a lot of CPU.
>
> I suppose it wouldn't make much sense at all to set globally though, so it
> could just confuse matters.

I agree that it would be nice to have more influence on this decision,
but let's start a new thread for that. I don't want this one getting
bloated with debates on that. It's not code I'm planning on going
anywhere near for this patch.

I'll start a thread...

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Choosing parallel_degree

2016-03-14 Thread David Rowley
Over in [1] James mentioned about wanting more to be able to have more
influence over the partial path's parallel_degree decision.  At risk
of a discussion on that hijacking the parallel aggregate thread, I
thought I'd start this for anyone who would want to discuss making
changes to that.

I've attached a simple C program which shows the parallel_degree which
will be chosen at the moment. For now it's based on the size of the
base relation. Perhaps that will need to be rethought later, perhaps
based on costs. But I just don't think it's something for 9.6.

Here's the output of the C program.

For 1 pages there will be 1 workers (rel size 0 MB, 0 GB)
For 3001 pages there will be 2 workers (rel size 23 MB, 0 GB)
For 9001 pages there will be 3 workers (rel size 70 MB, 0 GB)
For 27001 pages there will be 4 workers (rel size 210 MB, 0 GB)
For 81001 pages there will be 5 workers (rel size 632 MB, 0 GB)
For 243001 pages there will be 6 workers (rel size 1898 MB, 1 GB)
For 729001 pages there will be 7 workers (rel size 5695 MB, 5 GB)
For 2187001 pages there will be 8 workers (rel size 17085 MB, 16 GB)
For 6561001 pages there will be 9 workers (rel size 51257 MB, 50 GB)
For 19683001 pages there will be 10 workers (rel size 153773 MB, 150 GB)
For 59049001 pages there will be 11 workers (rel size 461320 MB, 450 GB)
For 177147001 pages there will be 12 workers (rel size 1383960 MB, 1351 GB)
For 531441001 pages there will be 13 workers (rel size 4151882 MB, 4054 GB)
For 1594323001 pages there will be 14 workers (rel size 12455648 MB, 12163 GB)

[1] 
http://www.postgresql.org/message-id/CANkGpBtUvzpdvF2=_iq64ujmvrpycs6d4i9-wepbusq1sq+...@mail.gmail.com

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
#include 
#include 

int max_parallel_degree = 64;

#define BLOCKSZ (8*1024)

#define MULTIPLIER 3
int
choose_degree(unsigned int pages)
{
	int		parallel_threshold = 1000;
	int		parallel_degree = 1;

	/*
	 * If this relation is too small to be worth a parallel scan, just return
	 * without doing anything ... unless it's an inheritance child.  In that case,
	 * we want to generate a parallel path here anyway.  It might not be worthwhile
	 * just for this relation, but when combined with all of its inheritance siblings
	 * it may well pay off.
	 */
	if (pages < parallel_threshold)
		return parallel_degree;

	/*
	 * Limit the degree of parallelism logarithmically based on the size of the
	 * relation.  This probably needs to be a good deal more sophisticated, but we
	 * need something here for now.
	 */
	while (pages > parallel_threshold * 3 &&
		   parallel_degree < max_parallel_degree)
	{
		parallel_degree++;
		parallel_threshold *= 3;
		if (parallel_threshold >= INT_MAX / 3)
			break;
	}
	return parallel_degree;
}

int
main(void)
{
	unsigned int pages;
	int last_workers = -1;

	for (pages = 1; pages != 0; pages += 1)
	{
		int workers = choose_degree(pages);

		if (workers != last_workers)
		{
			printf("For %u pages there will be %d workers (rel size %llu MB, %llu GB)\n", pages, workers,
(unsigned long long) pages * BLOCKSZ / 1024 / 1024,
(unsigned long long) pages * BLOCKSZ / 1024 / 1024 / 1024);
			last_workers = workers;
		}
	}
	return 0;
}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Choosing parallel_degree

2016-03-14 Thread David Rowley
On 15 March 2016 at 15:24, James Sewell  wrote:
>
> I did want to test with some really slow aggs, but even when I take out the 
> small table test in create_parallel_paths I can't seem to get a parallel plan 
> for a tiny table. Any idea on why this would be David?

In the test program I attached to the previous email, if I change the
parallel_threshold = 1000; to be parallel_threshold = 1; then I get
the following output:

For 1 pages there will be 1 workers (rel size 0 MB, 0 GB)
For 4 pages there will be 2 workers (rel size 0 MB, 0 GB)

So I'm getting 2 workers for only 4 pages. I've not tested in
Postgres, but if you do this and: SET parallel_setup_cost = 0; then
I'd imagine it should generate a parallel plan.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 15 March 2016 at 13:48, David Rowley  wrote:
> An updated patch will follow soon.

I've attached an updated patch which addresses some of Robert's and
Tomas' concerns.
I've not done anything about the exprCollation() yet, as I'm still
unsure of what it should do. I just don't see why returning the
Aggref's collation is correct, and we have nothing else to return.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


0001-Allow-aggregation-to-happen-in-parallel_2016-03-16.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v6] GSSAPI encryption support

2016-03-15 Thread David Steele
On 3/8/16 5:44 PM, Robbie Harwood wrote:

> Here's yet another version of GSSAPI encryption support.

OK, everything seems to be working fine with a 9.6 client and server so
next I tested older clients and I got this error:

$ /usr/lib/postgresql/9.1/bin/psql -h localhost \
  -U vagr...@pgmasters.net postgres
psql: FATAL:  GSSAPI did no provide required flags

There's a small typo in the error message, BTW.

Here's the output from the server logs:

DEBUG:  forked new backend, pid=7746 socket=9
DEBUG:  postgres child[7746]: starting with (
DEBUG:  postgres
DEBUG:  )
DEBUG:  InitPostgres
DEBUG:  my backend ID is 2
DEBUG:  StartTransaction
DEBUG:  name: unnamed; blockState:   DEFAULT; state: INPROGR,
xid/subid/cid: 0/1/0, nestlvl: 1, children:
DEBUG:  Processing received GSS token of length 667
DEBUG:  gss_accept_sec_context major: 0, minor: 0, outlen: 161,
outflags: 1b2
DEBUG:  sending GSS response token of length 161
DEBUG:  sending GSS token of length 161
FATAL:  GSSAPI did no provide required flags
DEBUG:  shmem_exit(1): 1 before_shmem_exit callbacks to make
DEBUG:  shmem_exit(1): 6 on_shmem_exit callbacks to make
DEBUG:  proc_exit(1): 3 callbacks to make
DEBUG:  exit(1)
DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
DEBUG:  reaping dead processes
DEBUG:  server process (PID 7746) exited with exit code 1

I got the same result with a 9.4 client so didn't try any others.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-03-15 Thread David Steele
On 3/15/16 1:17 AM, Amit Kapila wrote:

> On Tue, Mar 15, 2016 at 12:00 AM, David Steele 
>> This patch no longer applies cleanly:
>>
>> $ git apply ../other/group_update_clog_v6.patch
>> error: patch failed: src/backend/storage/lmgr/proc.c:404
>> error: src/backend/storage/lmgr/proc.c: patch does not apply
>> error: patch failed: src/include/storage/proc.h:152
>> error: src/include/storage/proc.h: patch does not apply
> 
> For me, with patch -p1 <  it works, but any how I have
> updated the patch based on recent commit.  Can you please check the
> latest patch and see if it applies cleanly for you now.

Yes, it now applies cleanly (101fd93).

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-03-15 Thread David Steele
On 3/14/16 12:27 PM, Artur Zakirov wrote:
> On 14.03.2016 18:48, David Steele wrote:
>> Hi Jeff,
>>
>> On 2/25/16 5:00 PM, Jeff Janes wrote:
>>
>>> But, It doesn't sound like I am going to win that debate.  Given that,
>>> I don't think we need a different name for the function. I'm fine with
>>> explaining the word-boundary subtlety in the documentation, and
>>> keeping the function name itself simple.
>>
>> It's not clear to me if you are requesting more documentation here or
>> stating that you are happy with it as-is.  Care to elaborate?
>>
>> Other than that I think this patch looks to be ready for committer. Any
>> objections?
>>
> 
> There was some comments about the word-boundary subtlety. But I think it
> was not enough.
> 
> I rephrased the explanation of word_similarity() and %>. It is better now.
> 
> But if it is not correct I can change the explanation.

Since to only change in the latest patch is to documentation I have
marked this "ready for committer".

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Weighted Stats

2016-03-15 Thread David Fetter
On Fri, Jan 08, 2016 at 04:37:36PM +1100, Haribabu Kommi wrote:
> On Mon, Dec 21, 2015 at 1:50 PM, David Fetter  wrote:
> > On Sun, Dec 20, 2015 at 06:13:33PM -0600, Jim Nasby wrote:
> >> On 11/2/15 5:46 PM, David Fetter wrote:
> >> >I'd like to add weighted statistics to PostgreSQL
> >>
> >> Anything happen with this? If community isn't interested, ISTM it'd be good
> >> to put this in PGXN.
> >
> > I think it's already in PGXN as an extension, and I'll get another
> > version out this early this week, as it involves mostly adding some
> > tests.
> >
> > I'll do the float8 ones for core this week, too, and unless there's a
> > really great reason to do more data types on the first pass, it should
> > be in committable shape.
> 
> I reviewed the patch, following are my observations.
> 
> 1. +   precision, numeric, or interval
> 
> with interval type it is giving problem. As interval data type is not 
> supported,
> so remove it in the list of supported inputs.

Done.

> 
> 2. +float8_weighted_avg(PG_FUNCTION_ARGS)
> 
> It will be helpful, if you provide some information as a function header,
> how the weighted average is calculated similar like other weighted functions.

Done.

> 3. + transvalues = check_float8_array(transarray,
> "float8_weighted_stddev_accum", 4);
> 
> The second parameter to check_float8_array should be "float8_weighted_accum".

Done.

> 4. There is an OID conflict of 4066 with latest master code.

Fixed.

> 5.+ A += newvalW * ( newvalX - transvalues[2] ) / W;
> + CHECKFLOATVAL(A, isinf(newvalW) || isinf(newvalX - transvalues[2])
> || isinf(1.0/W), true);
> 
> + Q += newvalW * (newvalX - transvalues[2]) * (newvalX - A);
> + CHECKFLOATVAL(A, isinf(newvalX -  transvalues[3]) || isinf(newvalX -
> A) || isinf(1.0/W), true);
> 
> 
> Is the need of calculation also needs to be passed to CHECKFLOATVAL?
> Just passing
> the variables involved in the calculation isn't enough? If expressions
> are required then
> it should be something as follows?
> 
> CHECKFLOATVAL(A, isinf(transvalues[2]) || isinf(newvalW) ||
> isinf(newvalX - transvalues[2]) || isinf(1.0/W), true);
> 
> CHECKFLOATVAL(Q, isinf(transvalues[3]) || isinf(newvalX -
> transvalues[2]) || isinf(newvalX - A) || isinf(1.0/W), true);

Done.

Please find attached a patch that uses the float8 version to cover the
numeric types.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 000489d..09ada7e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12645,6 +12645,29 @@ NULL baz(3 rows)
  
   

+weighted_average
+   
+   
+weighted_avg
+   
+   weighted_avg(value 
expression, weight 
expression)
+  
+  
+   smallint, int,
+   bigint, real, double
+   precision, numeric, or interval
+  
+  
+   numeric for any integer-type argument,
+   double precision for a floating-point argument,
+   otherwise the same as the argument data type
+  
+  the average (arithmetic mean) of all input values, weighted by 
the input weights
+ 
+
+ 
+  
+   
 bit_and

bit_and(expression)
@@ -13288,6 +13311,29 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y 
DESC) AS tab;
  
   

+weighted standard deviation
+population
+   
+   
+weighted_stddev_pop
+   
+   weighted_stddev_pop(value 
expression, weight 
expression)
+  
+  
+   smallint, int,
+   bigint, real, double
+   precision, or numeric
+  
+  
+   double precision for floating-point arguments,
+   otherwise numeric
+  
+  weighted population standard deviation of the input values
+ 
+
+ 
+  
+   
 standard deviation
 sample

@@ -13311,6 +13357,29 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y 
DESC) AS tab;
  
   

+weighted standard deviation
+sample
+   
+   
+weighted_stddev_samp
+   
+   weighted_stddev_samp(value 
expression, weight 
expression)
+  
+  
+   smallint, int,
+   bigint, real, double
+   precision, or numeric
+  
+  
+   double precision for floating-point arguments,
+   otherwise numeric
+  
+  weighted sample standard deviation of the input values
+ 
+
+ 
+  
+   
 variance

variance(expression)
diff -

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

2016-03-15 Thread David Steele
Hi Michael,

On 3/14/16 7:07 PM, Michael Paquier wrote:

> On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier  
> wrote:
>
>> On Mon, Mar 14, 2016 at 4:32 PM, David Steele  wrote:
>>
>>> Could you provide an updated set of patches for review?  Meanwhile I am
>>> marking this as "waiting for author".
>>
>> Sure. I'll provide them shortly with all the comments addressed. Up to
>> now I just had a couple of comments about docs and whitespaces, so I
>> didn't really bother sending a new set, but this meritates a rebase.
> 
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.

For this first review I would like to focus on the user visible changes
introduced in 0001-0002.

First I created two new users with each type of supported verifier:

postgres=# create user test with password 'test';
CREATE ROLE
postgres=# create user testu with unencrypted password 'testu'
   valid until '2017-01-01';
CREATE ROLE

1) I see that rolvaliduntil is still in pg_authid:

postgres=# select oid, rolname, rolvaliduntil from pg_authid;

  oid  | rolname | rolvaliduntil
---+-+
10 | vagrant |
 16387 | test|
 16388 | testu   | 2017-01-01 00:00:00+00

I think that's OK if we now define it to be "role validity" (it's still
password validity in the patched docs).  I would also like to see a
validuntil column in pg_auth_verifiers so we can track password
expiration for each verifier separately.  For now I think it's enough to
copy the same validity both places since there can only be one verifier.

2) I don't think the column naming in pg_auth_verifiers is consistent
with other catalogs:

postgres=# select * from pg_auth_verifiers;

 roleid | verimet |   verival
+-+-
  16387 | m   | md505a671c66aefea124cc08b76ea6d30bb
  16388 | p   | testu

System catalogs generally use a 3 character prefix so I would expect the
columns to be (if we pick avr as a prefix):

avrrole
avrmethod
avrverifier
avrvaliduntil

I'm not a big fan in abbreviating too much so you can see I've expanded
the names a bit.

3) rolpassword is still in pg_shadow even though it is not useful anymore:

postgres=# select usename, passwd, valuntil from pg_shadow;

 usename |  passwd  |valuntil
-+--+
 vagrant |  |
 test|  |
 testu   |  | 2017-01-01 00:00:00+00

If anyone is actually using this column in a meaningful way they are in
for a nasty surprise when trying use the value in passwd as a verifier.
 I would prefer to drop the column entirely and produce a clear error.

Perhaps a better option would be to drop pg_shadow entirely since it
seems to have no further purpose in life.

Thanks,
-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-15 Thread David Steele
On 3/15/16 10:39 AM, Michael Paquier wrote:

> On Thu, Mar 10, 2016 at 4:25 AM, Andres Freund wrote:
> 
>> Note that we currently have some frontend programs with the equivalent
>> problem. Most importantly receivelog.c (pg_basebackup/pg_recveivexlog)
>> are missing pretty much the same directory fsyncs.  And at least for
>> pg_recvxlog it's critical, especially now that receivexlog support
>> syncrep.  I've not done anything about that; there's pretty much no
>> chance to share backend code with the frontend in the back-branches.
> 
> Yeah, true. We definitely need to do something for that, even for HEAD
> it seems like an overkill to have something in for example src/common
> to allow frontends to have something if the fix is localized
> (pg_rewind may use something else), and it would be nice to finish
> wrapping that for the next minor release, so I propose the attached
> patches.

I noticed this when reviewing the pg_receive_xlog refactor and was going
to submit a patch after the CF.  It didn't occur to me to piggyback on
this work but I think it makes sense.

+1 from me for fixing this in pg_receivexlog and back-patching.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE behind-the-scenes effects' CONTEXT

2016-03-15 Thread David Steele
On 1/30/16 10:52 AM, Marko Tiikkaja wrote:
> On 2016-01-21 04:17, Simon Riggs wrote:
>> Marko, I was/am waiting for an updated patch. Could you comment please?
> 
> Sorry, I've not found time to work on this recently.
> 
> Thanks for everyone's comments so far.  I'll move this to the next CF
> and try and get an updated patch done in time for that one.

There's been no activity on this thread for while and no new patch was
submitted for the CF.

I have marked it "returned with feedback".

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] IF (NOT) EXISTS in psql-completion

2016-03-15 Thread David Steele
On 3/15/16 1:42 PM, Robert Haas wrote:
> On Fri, Feb 26, 2016 at 2:37 AM, Kyotaro HORIGUCHI
>  wrote:
>> Hello, this is the new version of this patch.
> 
> The CommitFest entry for this patch is messed up.  It shows no author,
> even though I'm pretty sure that a patch has to have one by
> definition.
> 
> https://commitfest.postgresql.org/9/518/
> 
> Also, this patch was listed as Waiting on Author, but it's been
> updated since it was last reviewed, so I switched it back to Needs
> Review.  Can someone please review it and, if appropriate, mark it
> Ready for Committer?

Author has been set to Kyotaro Horiguchi.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH] fix DROP OPERATOR to reset links to itself on commutator and negator

2016-03-15 Thread David Steele
Hi Kevin,

On 3/1/16 11:08 AM, Roma Sokolov wrote:
>> On 27 Feb 2016, at 03:46, Euler Taveira  wrote:
>> Because it is not a common practice to test catalog dependency on
>> separate tests (AFAICS initial catalogs are tested with oidjoins.sql).
>> Also, your test case is too huge for such a small use case. 
> 
> It seems oidjoins.sql is automatically generated and contains tests only for
> initial database state. On the other hand, there are tests for CREATE OPERATOR
> and ALTER OPERATOR, so it seems reasonable to me to have separate DROP 
> OPERATOR
> test, or to move all operator related testing to one file. This is however
> clearly outside of the scope of this patch, so in v3 I've simplified tests 
> using
> queries from oidjoins.sql.

You've signed up to review this patch, do you have an idea of when you
might be able to do the review?

Thanks,
-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: pg_hba_lookup function to get all matching pg_hba.conf entries

2016-03-15 Thread David Steele
On 3/3/16 12:16 AM, Haribabu Kommi wrote:
> On Fri, Feb 5, 2016 at 2:29 PM, Haribabu Kommi  
> wrote:
>>
>> This patch needs to be applied on top discard_hba_and_ident_cxt patch
>> that is posted earlier.
> 
> Here I attached a re-based patch to the latest head with inclusion of
> discard_hba_ident_cxt patch for easier review as a single patch.

Alex, Scott, do you have an idea of when you'll be able to review this
new version?

It applies with a minor conflict (caused by pg_control_* commit):

$ git apply -3 ../other/pg_hba_lookup_poc_v13.patch
error: patch failed: src/include/utils/builtins.h:1151
Falling back to three-way merge...
Applied patch to 'src/include/utils/builtins.h' with conflicts.
U src/include/utils/builtins.h

Thanks,
-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench stats per script & other stuff

2016-03-15 Thread David Steele
On 3/4/16 1:53 PM, Fabien COELHO wrote:

>>> That is why the "fs" variable in process_file is declared "static",
>>> and why
>>> I wrote "some hidden awkwarness".
>>>
>>> I did want to avoid a malloc because then who would free the struct?
>>> addScript cannot to it systematically because builtins are static. Or it
>>> would have to create an on purpose struct, but I then that would be more
>>> awkwarness, and malloc/free to pass arguments between functions is not
>>> efficient nor very elegant.
>>>
>>> So the "static" option looked like the simplest & most elegant version.
>>
>> Surely that trick breaks if you have more than one -f switch, no?  Oh, I
>> see what you're doing: you only use the command list, which is
>> allocated, so it doesn't matter that the rest of the struct changes
>> later.
> 
> The two fields that matter (desc and commands) are really copied into
> sql_scripts, so what stays in the is overriden if used another time.
> 
>> I'm not concerned about freeing the struct; what's the problem with it
>> surviving until the program terminates?
> 
> It is not referenced anywhere so it is a memory leak.
> 
>> If somebody specifies thousands of -f switches, they will waste a few
>> bytes with each, but I'm hardly concerned about a few dozen kilobytes
>> there ...
> 
> Ok, so you prefer a memory leak. I hate it on principle.
> 
> Here is a v23 with a memory leak anyway.

Álvaro, it looks like you've been both reviewer and committer on this
work for some time.

The latest patch seems to address you final concern.  Can I mark it
"ready for committer"?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Supporting +-Infinity values by to_timestamp(float8)

2016-03-15 Thread David Steele
On 3/4/16 2:56 PM, Vitaly Burovoy wrote:

> On 3/4/16, Anastasia Lubennikova  wrote:
>
>> I think that you should update documentation. At least description of
>> epoch on this page:
>> http://www.postgresql.org/docs/devel/static/functions-datetime.html
> 
> Thank you very much for pointing where it is located (I saw only
> "to_timestamp(TEXT, TEXT)").
> I'll think how to update it.

Vitaly, have you decided how to update this yet?

>> 3. (nitpicking) I don't sure about "4STAMPS" suffix. "4" is nice
>> abbreviation, but it seems slightly confusing to me.
> 
> It doesn't matter for me what it is called, it is short enough and
> reflects a type on which it is applied.
> What would the best name be for it?

Anastasia, any suggestions for a better name, or just leave it as is?

I'm not in favor of the "4", either.  I think I would prefer
JULIAN_MAXYEAR_STAMP.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2016-03-15 Thread David Rowley
On 16 March 2016 at 06:39, Tomas Vondra  wrote:
> After looking at the parallel aggregate patch, I also looked at this one, as
> it's naturally related. Sadly I haven't found any issue that I could nag
> about ;-) The patch seems well baked, as it was in the oven for quite a long
> time already.

Thanks for looking at this.

> The one concern I do have is that it only adds (de)serialize functions for
> SUM(numeric) and AVG(numeric). I think it's reasonable not to include that
> into the patch, but it will look a bit silly if that's all that gets into
> 9.6.

Yes me too, so I spent several hours yesterday writing all of the
combine functions and serialisation/deserialisation that are required
for all of SUM(), AVG() STDDEV*(). I also noticed that I had missed
using some existing functions for bool_and() and bool_or() so I added
those to pg_aggregate.h. I'm just chasing down a crash bug on
HAVE_INT128 enabled builds, so should be posting a patch quite soon. I
didn't touch the FLOAT4 and FLOAT8 aggregates as I believe Haribabu
has a patch for that over on the parallel aggregate thread. I've not
looked at it in detail yet.

If Haribabu's patch does all that's required for the numerical
aggregates for floating point types then the status of covered
aggregates is (in order of pg_aggregate.h):

* AVG() complete coverage
* SUM() complete coverage
* MAX() complete coverage
* MIN() complete coverage
* COUNT() complete coverage
* STDDEV + friends complete coverage
* regr_*,covar_pop,covar_samp,corr not touched these.
* bool*() complete coverage
* bitwise aggs. complete coverage
* Remaining are not touched. I see diminishing returns with making
these parallel for now. I think I might not be worth pushing myself
any harder to make these ones work.

Does what I have done + floating point aggs from Haribabu seem
reasonable for 9.6?

> I think it would be good to actually document which aggregates support
> parallel execution, and which don't. Currently the docs don't mention this
> at all, and it's tricky for regular users to deduce that as it's related to
> the type of the internal state and not to the input types. An example of
> that is the SUM(bigint) example mentioned above.

I agree. I will look for a suitable place.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Aggregate

2016-03-15 Thread David Rowley
On 16 March 2016 at 09:23, Robert Haas  wrote:
> On Mon, Mar 14, 2016 at 7:56 PM, David Rowley
>  wrote:
>> A comment does explain this, but perhaps it's not good enough, so I've
>> rewritten it to become.
>>
>> /*
>>  * PartialAggref
>>  *
>>  * When partial aggregation is required in a plan, the nodes from the partial
>>  * aggregate node, up until the finalize aggregate node must pass the 
>> partially
>>  * aggregated states up the plan tree. In regards to target list construction
>>  * in setrefs.c, this requires that exprType() returns the state's type 
>> rather
>>  * than the final aggregate value's type, and since exprType() for Aggref is
>>  * coded to return the aggtype, this is not correct for us. We can't fix this
>>  * by going around modifying the Aggref to change it's return type as 
>> setrefs.c
>>  * requires searching for that Aggref using equals() which compares all 
>> fields
>>  * in Aggref, and changing the aggtype would cause such a comparison to fail.
>>  * To get around this problem we wrap the Aggref up in a PartialAggref, this
>>  * allows exprType() to return the correct type and we can handle a
>>  * PartialAggref in setrefs.c by just peeking inside the PartialAggref to 
>> check
>>  * the underlying Aggref. The PartialAggref lives as long as executor 
>> start-up,
>>  * where it's removed and replaced with it's underlying Aggref.
>>  */
>> typedef struct PartialAggref
>>
>> does that help explain it better?
>
> I still think that's solving the problem the wrong way.  Why can't
> exprType(), when applied to the Aggref, do something like this?
>
> {
> Aggref *aref = (Aggref *) expr;
> if (aref->aggpartial)
> return aref->aggtranstype;
> else
> return aref->aggtype;
> }
>
> The obvious answer is "well, because those fields don't exist in
> Aggref".  But shouldn't they?  From here, it looks like PartialAggref
> is a cheap hack around not having whacked Aggref around hard for
> partial aggregation.

We could do it that way if we left the aggpartial field out of the
equals() check, but I think we go at length to not do that. Just look
at what's done for all of the location fields. In any case if we did
that then it might not actually be what we want all of the time...
Perhaps in some cases we'd want equals() to return false when the
aggpartial does not match, and in other cases we'd want it to return
true. There's no way to control that behaviour, so to get around the
setrefs.c problem I created the wrapper node type, which I happen to
think is quite clean. Just see Tom's comments about Haribabu's temp
fix for the problem where he put some hacks into the equals for aggref
in [1].

>>> I don't see where this applies has_parallel_hazard or anything
>>> comparable to the aggregate functions.  I think it needs to do that.
>>
>> Not sure what you mean here.
>
> If the aggregate isn't parallel-safe, you can't do this optimization.
> For example, imagine an aggregate function written in PL/pgsql that
> for some reason writes data to a side table.  It's
> has_parallel_hazard's job to check the parallel-safety properties of
> the functions used in the query.

Sorry, I do know what you mean by that. I might have been wrong to
assume that the parallelModeOK check did this. I will dig into how
that is set exactly.

>>> +   /* XXX +1 ? do we expect the main process to actually do real work? 
>>> */
>>> +   numPartialGroups = Min(numGroups, subpath->rows) *
>>> +   (subpath->parallel_degree + 
>>> 1);
>>> I'd leave out the + 1, but I don't think it matters much.
>>
>> Actually I meant to ask you about this. I see that subpath->rows is
>> divided by the Path's parallel_degree, but it seems the main process
>> does some work too, so this is why I added + 1, as during my tests
>> using a query which produces 10 groups, and had 4 workers, I noticed
>> that Gather was getting 50 groups back, rather than 40, I assumed this
>> is because the main process is helping too, but from my reading of the
>> parallel query threads I believe this is because the Gather, instead
>> of sitting around idle tries to do a bit of work too, if it appears
>> that nothing else is happening quickly enough. I should probably go
>> read nodeGather.c to learn that though.
>
> Yes, the main process does do some work, but less and less as the
> query gets more complicated.  See comments in cost_seqscan().

Thanks

[1] http://www.postgresql.org/message-id/10158.1457329...@sss.pgh.pa.us

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   3   4   5   6   7   8   9   10   >