Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-04-05 Thread Pavel Stehule
2017-04-05 22:33 GMT+02:00 Andres Freund :

> Hi,
>
>
> I'd like some input from other committers whether we want this.  I'm
> somewhat doubtful, but don't have particularly strong feelings.
>
>
> > +
> > +  
> > +   Block level PRAGMA
> > +
> > +   
> > +PRAGMA
> > +in PL/pgSQL
> > +   
> > +
> > +   
> > +The block level PRAGMA allows to change the
> > +PL/pgSQL compiler behavior. Currently
> > +only PRAGMA PLAN_CACHE is supported.
>
> Why are we doing this on a block level?
>

There are a few reasons:

1. it is practical in some cases to mix multiple plan strategies in one
function

a)

FOR IN simple_select
LOOP
  ENFORCE ONE SHOT PLANS
  BEGIN
  .. queries ..
  END;
END LOOP;


b)

ENFORCE ONE SHOT PLANS
BEGIN
  FOR IN complex query requires one shot plan
  LOOP
RETURNS TO DEFAULT PLAN CACHE
BEGIN
  .. queries ..
END;
  END LOOP;

2. This behavior is defined in the Ada language, and in PL/SQL too. If we
get autonomous transactions, then we can have equivalent functionality:

a) run complete function under autonomous transaction
b) run some parts of function (some blocks) under autonomous transaction

It is not necessary, but it can avoid the need to generate auxiliary functions.
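
For concreteness, here is how example (a) might look with the PRAGMA syntax
from the patch. This is only a sketch of the proposed, uncommitted syntax;
the inner-block placement is my reading of it, and the table and column
names are purely illustrative:

CREATE FUNCTION process_rows() RETURNS void AS $$
DECLARE
  r record;
BEGIN
  FOR r IN SELECT id FROM some_table       -- illustrative table
  LOOP
    DECLARE
      PRAGMA PLAN_CACHE(force_custom_plan); -- applies to this block only
    BEGIN
      -- queries here would get one-shot (custom) plans
      UPDATE some_table SET flag = true WHERE id = r.id;
    END;
  END LOOP;
END;
$$ LANGUAGE plpgsql;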


>
>
> > +
> > +CREATE FUNCTION enforce_fresh_plan(_id text) RETURNS boolean AS $$
> > +DECLARE
> > +  PRAGMA PLAN_CACHE(force_custom_plan);
> > +BEGIN
> > +  -- in this block every embedded query uses one shot plan
>
> *plans
>
>
> > +
> > + PRAGMA PLAN_CACHE
> > +
> > + 
> > +  The plan cache behavior can be controlled using
> > +  PRAGMA PLAN_CACHE. This PRAGMA can be
> used both
> > +  for whole function or in individual blocks. The following options
> are
>
> *functions
>
>
> > +  possible: DEFAULT - default
> > +  PL/pgSQL implementation - the system
> tries
> > +  to decide between custom plan and generic plan after five query
> > +  executions, FORCE_CUSTOM_PLAN - the chosen
> execution
> > +  plan will be the one shot plan - it is specific for every set of
> > +  used paramaters, FORCE_GENERIC_PLAN - the
> generic
> > +  plan will be used from the start.
>
> I don't think it's a good idea to explain this here, this'll just get
> outdated.  I think we should rather have a link here.
>
>
> > + 
> > +
> > + 
> > +  
> > +   PRAGMA PLAN_CACHE
> > +   in PL/pgSQL
> > +  
> > +  The plan for INSERT is always a generic
> > plan.
>
> That's this specific insert, right? Should be mentioned, sounds more
> generic to me.
>
> > +/* --
> > + * Returns pointer to current compiler settings
> > + * --
> > + */
> > +PLpgSQL_settings *
> > +plpgsql_current_settings(void)
> > +{
> > + return current_settings;
> > +}
> > +
> > +
> > +/* --
> > + * Setup default compiler settings
> > + * --
> > + */
> > +void
> > +plpgsql_settings_init(PLpgSQL_settings *settings)
> > +{
> > + current_settings = settings;
> > +}
>
> Hm. This is only ever set to _settings.
>
>
> > +/* --
> > + * Set compiler settings
> > + * --
> > + */
> > +void
> > +plpgsql_settings_set(PLpgSQL_settings *settings)
> > +{
> > + PLpgSQL_nsitem *ns_cur = ns_top;
> > +
> > + /*
> > +  * Modify settings directly, when ns has local settings data.
> > +  * When ns uses shared settings, create settings first.
> > +  */
> > + while (ns_cur->itemtype != PLPGSQL_NSTYPE_LABEL)
> > + ns_cur = ns_cur->prev;
> > +
> > + if (ns_cur->local_settings == NULL)
> > + {
> > + ns_cur->local_settings = palloc(sizeof(PLpgSQL_settings));
> > + ns_cur->local_settings->prev = current_settings;
> > + current_settings = ns_cur->local_settings;
> > + }
> > +
> > + current_settings->cursor_options = settings->cursor_options;
> > +}
>
> This seems like a somewhat weird method.  Why do we have a global
> settings, when we essentially just want to use something in the current
> ns?
>
>
I am not sure I understand the question.

The settings are implemented lazily. If an ns does not have any settings of
its own, nothing is done. It requires a global variable, because some ns
entries can be skipped.

My first implementation was 1:1 .. ns:settings - but it added some overhead
for every ns, even when the ns had no settings of its own.

Regards

Pavel



>
>
> - Andres
>


Re: [HACKERS] Re: new set of psql patches for loading (saving) data from (to) text, binary files

2017-04-05 Thread Pavel Stehule
2017-04-06 3:34 GMT+02:00 Stephen Frost :

> Andres,
>
> * Andres Freund (and...@anarazel.de) wrote:
> > On 2017-04-05 21:07:59 -0400, Stephen Frost wrote:
> > > * Andres Freund (and...@anarazel.de) wrote:
> > > > I don't yet have a good idea how to deal with moving individual cells
> > > > into files, so they can be loaded.  One approach would be to to have
> > > > something like
> > > >
> > > > \storequeryresult filename_template.%row.%column
> > > >
> > > > which'd then print the current query buffer into the relevant file
> after
> > > > doing replacement on %row and %column.
> > >
> > > I don't actually agree that there's a problem having the output from a
> > > query stored direclty in binary form into a single file.  The above
> > > approach seems to imply that every binary result must go into an
> > > independent file, and perhaps that would be useful in some cases, but I
> > > don't see it as required.
> >
> > Well, it'd not be enforced - it'd depend on your template.  But for a
> > lot of types of files, it'd not make sense to store multiple
> > columns/rows in file. Particularly for ones where printing them out to
> > files is actually meaningful (i.e. binary ones).
>
> Having the template not require the row/column place-holders included
> strikes me as more likely to be confusing.  My initial thinking around
> this was that users who actually want independent files would simply
> issue independent queries, while users who want to take a bunch of int4
> columns and dump them into a single binary file would be able to do so
> easily.
>
> I'm not against adding the ability for a single query result to be saved
> into independent files, but it strikes me as feature creep on this basic
> capability.  Further, I don't see any particular reason why splitting up
> the output from a query into multiple files is only relevant for binary
> data.
>

The files can simply be joined together outside psql.

Personally I prefer the relation: single field, single file, in a special \g
command - because then I can simply turn off all formatting and the result
should be correct every time.

Stephen, do you have a use case for your request?

Regards

Pavel


> Thanks!
>
> Stephen
>


Re: [HACKERS] BRIN cost estimate

2017-04-05 Thread David Rowley
On 5 April 2017 at 17:34, Emre Hasegeli  wrote:
>
>> Interested to hear comments on this.
>
>
> I don't have chance to test it right now, but I am sure it would be an
> improvement over what we have right now.  There is no single correct
> equation with so many unknowns we have.
>
>>   *indexTotalCost += (numTuples * *indexSelectivity) *
>> (cpu_index_tuple_cost + qual_op_cost);
>
> Have you preserved this line intentionally?  This would make BRIN index scan
> cost even higher, but the primary property of BRIN is to be cheap to scan.
> Shouldn't bitmap heap scan node take into account of checking the tuples in
> its cost?

Good point. That's wrong, but I'm confused at why you kept the:

+ *indexTotalCost += selec * numTuples * cpu_index_tuple_cost;

at all if that's the case. All the BRIN scan is going to do is build a
bitmap of the matching ranges found.

I've ended up with:

+ /*
+ * Charge a small amount per range tuple which we expect to match to. This
+ * is meant to reflect the costs of manipulating the bitmap. The BRIN scan
+ * will set a bit for each page in the range when we find a matching
+ * range, so we must multiply the charge by the number of pages in the
+ * range.
+ */
+ *indexTotalCost += 0.1 * cpu_operator_cost * estimatedRanges *
+ pagesPerRange;

Which is inspired by cost_bitmap_tree_node(), and seems like a fair
cost for setting a single bit.
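
For scale (my own numbers, not from the patch): with the default
cpu_operator_cost of 0.0025, the charge works out to 0.1 * 0.0025 = 0.00025
per heap page in the matching ranges, e.g.

SELECT 0.1 * 0.0025 * 100 * 128 AS added_cost;  -- about 3.2 for 100 ranges of 128 pages

Since estimatedRanges and pagesPerRange move in opposite directions for a
given amount of matching heap, the charge tracks the number of heap pages
covered rather than the pages_per_range setting, which matches the
observation further down that it stays about the same.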

I also noticed that you were doing:

+ if (get_index_stats_hook &&
+ (*get_index_stats_hook) (root, index->indexoid, 1, &vardata))

and

+ vardata.statsTuple = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(index->indexoid),
+ Int16GetDatum(1),

I wondered why you chose to hardcode that as 1, and I thought that was
surely a bug. But then looking deeper it seems to be copied from
btcostestimate(), which also seems to be a long-standing bug
introduced in a536ed53bca40cb0d199824e358a86fcfd5db7f2. I'll go write
a patch for that one now.

I've been looking at the estimates output by the following:

create table ab (a int, b int);
insert into ab select x/1000,x%1000 from generate_series(1,100)x;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 128);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 64);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 32);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 16);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 8);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 4);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 2);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;
create index ab_a_idx on ab using brin (a) with (pages_per_range = 1);
analyze ab;
explain analyze select * from ab where a = 1;
drop index ab_a_idx;

and I see that the bitmap index scan gets more expensive the more
ranges that are in the index, which of course makes sense. The small
bitmap maintenance cost I added should stay about the same, since I
multiplied it by the pagesPerRange, it may only drop slightly as it
may expect to set fewer bits in the bitmap due to the ranges being
more specific to the tuples in the heap. The row estimates seem pretty
sane there too, based on how many were filtered out on the recheck.

I do still have concerns about how the correlation value is used, and
the fact that we use the highest value, but then the whole use of this
value is far from perfect, and is just a rough indication of how
things are.

Apart from that, I'm pretty happy with this now.



-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


brin-correlation-drowley_v2.patch
Description: Binary data



Re: [HACKERS] SCRAM authentication, take three

2017-04-05 Thread Noah Misch
On Tue, Mar 07, 2017 at 02:36:13PM +0200, Heikki Linnakangas wrote:
> I didn't include the last-minute changes to the way you specify this in
> pg_hba.conf. So it's still just "scram". I agree in general that we should
> think about how to extend that too, but I think the proposed syntax was
> overly verbose for what we actually support right now. Let's discuss that as
> a separate thread, as well.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Heikki,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')

2017-04-05 Thread Noah Misch
On Fri, Mar 17, 2017 at 10:10:59AM +0900, Michael Paquier wrote:
> On Fri, Mar 17, 2017 at 2:30 AM, Jeff Janes  wrote:
> > On Thu, Mar 9, 2017 at 4:59 AM, Michael Paquier 
> > wrote:
> >>
> >> On Thu, Mar 9, 2017 at 1:17 AM, Joe Conway  wrote:
> >> > On 03/07/2017 08:29 PM, Tom Lane wrote:
> >> >> Michael Paquier  writes:
> >> >>> here is a separate thread dedicated to the following extension for
> >> >>> CREATE/ALTER ROLE: PASSWORD ('value' USING 'method').
> >> >>
> >> >> The parentheses seem weird ... do we really need those?
> >> >
> >> > +1
> >>
> >> Seeing 3 opinions in favor of that, let's do so then. I have updated
> >> the patch to not use parenthesis.
> >
> > The regression tests only exercise the CREATE ROLE...USING version, not the
> > ALTER ROLE...USING version.
> 
> Done.
> 
> > +and plain for an non-hashed password.  If the password
> > +string is already in MD5-hashed or SCRAM-hashed, then it is
> > +stored hashed as-is.
> >
> > In the last line, I think "stored as-is" sounds better.
> 
> Okay.
> 
> > Other than that, it looks good to me.
> 
> Thanks for the review. Attached is an updated patch.
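
(For context, the grammar under discussion, with the parentheses dropped,
looks roughly like the following. This is only a sketch of the patch's
proposed, uncommitted syntax; the method names follow the patch's
plain/MD5/SCRAM wording and are illustrative:

CREATE ROLE alice LOGIN PASSWORD 'secret' USING 'plain';
ALTER ROLE alice PASSWORD 'secret2' USING 'scram-sha-256';

An already-hashed password string passed this way is stored as-is, per the
doc change quoted above.)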

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Heikki,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Fast Default WIP patch for discussion

2017-04-05 Thread Serge Rielau
Andres,
Yes, I still want to push this in. However I have not had time to get back to 
it. I’m embarrassed to say that I don’t even know where the comments that were 
issued occurred.
Cheers Serge

On Wed, Apr 5, 2017 at 4:47 PM, Andres Freund  wrote:
Hi Serge,

On 2016-10-28 08:28:11 -0700, Serge Rielau wrote:
> Time for me to dig into that then.

Are you planning to update your POC at some point? This'd be a very
welcome improvement.

Regards,

Andres

Re: [HACKERS] Changing references of password encryption to hashing

2017-04-05 Thread Noah Misch
On Thu, Mar 16, 2017 at 07:27:11AM -0700, Joe Conway wrote:
> On 03/16/2017 06:19 AM, Robert Haas wrote:
> > On Mon, Mar 13, 2017 at 4:48 AM, Craig Ringer  wrote:
> >> So I'm in favour of fixing the docs but I'm not keen on changing the
> >> SQL syntax in a way that just kind of papers over part of the
> >> problems.
> > 
> > I agree.  I think that trying to design new SQL syntax at this point
> > is unlikely to be a good idea - we're just about out of time here, and
> > some people who might care about this are busy on other things, and
> > the deadline for patches that do new things has long since passed.
> > But I like the idea of trying to improve the documentation.

Seems like a good direction.

> Agreed. I think the documentation fixes definitely should be done, but
> understand that the grammar is a longer term issue with backward
> compatibility implications. Acknowledging the problem is the first step ;-)

Most of the user-visible (including doc/) proposed changes alter material
predating v10.  Therefore, I've removed this from open item status.




Re: [HACKERS] Multiple false-positive warnings from Valgrind

2017-04-05 Thread Noah Misch
On Wed, Mar 29, 2017 at 12:34:52PM +0900, Michael Paquier wrote:
> On Thu, Mar 23, 2017 at 5:15 PM, Michael Paquier
>  wrote:
> > On Tue, Mar 21, 2017 at 10:57 PM, Aleksander Alekseev
> >  wrote:
> >> Recently I've decided to run PostgreSQL under Valgrind according to wiki
> >> description [1]. Lots of warnings are generated [2] but it is my
> >> understanding that all of them are false-positive. For instance I've
> >> found these two reports particularly interesting:
> >>
> >> ```
> >> ==00:00:40:40.161 7677== Use of uninitialised value of size 8
> >> ==00:00:40:40.161 7677==at 0xA15FF5: pg_b64_encode (base64.c:68)
> >> ==00:00:40:40.161 7677==by 0x6FFE85: scram_build_verifier 
> >> (auth-scram.c:348)
> >> ==00:00:40:40.161 7677==by 0x6F3F76: encrypt_password (crypt.c:171)
> >> ==00:00:40:40.161 7677==by 0x68F40C: CreateRole (user.c:403)
> >> ==00:00:40:40.161 7677==by 0x85D53A: standard_ProcessUtility 
> >> (utility.c:716)
> >> ==00:00:40:40.161 7677==by 0x85CCC7: ProcessUtility (utility.c:353)
> >> ==00:00:40:40.161 7677==by 0x85BD22: PortalRunUtility (pquery.c:1165)
> >> ==00:00:40:40.161 7677==by 0x85BF20: PortalRunMulti (pquery.c:1308)
> >> ==00:00:40:40.161 7677==by 0x85B4A0: PortalRun (pquery.c:788)
> >> ==00:00:40:40.161 7677==by 0x855672: exec_simple_query 
> >> (postgres.c:1101)
> >> ==00:00:40:40.161 7677==by 0x8597BB: PostgresMain (postgres.c:4066)
> >> ==00:00:40:40.161 7677==by 0x7C6322: BackendRun (postmaster.c:4317)
> >> ==00:00:40:40.161 7677==  Uninitialised value was created by a stack 
> >> allocation
> >> ==00:00:40:40.161 7677==at 0x6FFDB7: scram_build_verifier 
> >> (auth-scram.c:328)
> >
> > I can see those warnings as well after calling a code path of
> > scram_build_verifier(), and I have a hard time seeing that as nothing
> > else than a false positive as you do. All those warnings go away if
> > you just initialize just do MemSet(salt, 0, SCRAM_SALT_LEN) before
> > calling pg_backend_random() but this data is filled in with
> > RAND_bytes() afterwards (if built with openssl). The estimated lengths
> > of the encoding are also correct. I don't see immediately what's wrong
> > here, this deserves a second look...
> 
> And it seems to me that this is caused by the routines of OpenSSL.
> When building without --with-openssl, using the fallback
> implementations of SHA256 and RAND_bytes I see no warnings generated
> by scram_build_verifier... I think it makes most sense to discard that
> from the list of open items.

This defect has roughly the gravity of a compiler warning.  Dropped from open
items on that basis.




Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-05 Thread Amit Langote
On 2017/04/06 13:08, Keith Fiske wrote:
> On Wed, Apr 5, 2017 at 2:51 PM, Keith Fiske wrote:
>> Only issue I see with this, and I'm not sure if it is an issue, is what
>> happens to that default constraint clause when 1000s of partitions start
>> getting added? From what I gather the default's constraint is built based
>> off the cumulative opposite of all other child constraints. I don't
>> understand the code well enough to see what it's actually doing, but if
>> there are no gaps, is the method used smart enough to aggregate all the
>> child constraints to make a simpler constraint that is simply outside the
>> current min/max boundaries? If so, for serial/time range partitioning this
>> should typically work out fine since there are rarely gaps. This actually
>> seems more of an issue for list partitioning where each child is a distinct
>> value or range of values that are completely arbitrary. Won't that check
>> and re-evaluation of the default's constraint just get worse and worse as
>> more children are added? Is there really even a need for the default to
>> have an opposite constraint like this? Not sure on how the planner works
>> with partitioning now, but wouldn't it be better to first check all
>> non-default children for a match the same as it does now without a default
>> and, failing that, then route to the default if one is declared? The
>> default should accept any data then so I don't see the need for the
>> constraint unless it's required for the current implementation. If that's
>> the case, could that be changed?

Unless I misread your last sentence, I think there might be some
confusion.  Currently, the partition constraint (think of these as you
would of user-defined check constraints) is needed for two reasons: 1. to
prevent direct insertion of rows into the default partition for which a
non-default partition exists; no two partitions should ever have duplicate
rows.  2. so that planner can use the constraint to determine if the
default partition needs to be scanned for a query using constraint
exclusion; no need, for example, to scan the default partition if the
query requests only key=3 rows and a partition for the same exists (no
other partition should have key=3 rows by definition, not even the
default).  As things stand today, planner needs to look at every partition
individually for using constraint exclusion to possibly exclude it, *even*
with declarative partitioning and that would include the default partition.

> Actually, thinking on this more, I realized this does again come back to
> the lack of a global index. Without the constraint, data could be put
> directly into the default that could technically conflict with the
> partition scheme elsewhere. Perhaps, instead of the constraint, inserts
> directly to the default could be prevented on the user level. Writing to
> valid children directly certainly has its place, but been thinking about
> it, and I can't see any reason why one would ever want to write directly to
> the default. It's use case seems to be around being a sort of temporary
> storage until that data can be moved to a valid location. Would still need
> to allow removal of data, though.

As mentioned above, the default partition will not allow directly
inserting a row whose key maps to some existing (non-default) partition.

As far as tuple-routing is concerned, it will choose the default partition
only if no other partition is found for the key.  Tuple-routing doesn't
use the partition constraints directly per se, like one of the two things
mentioned above do.  One could say that tuple-routing assigns the incoming
rows to partitions such that their individual partition constraints are
not violated.

Finally, we don't yet offer global guarantees for constraints like unique.
 The only guarantee that's in place is that no two partitions can contain
the same partition key.
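
To make the two purposes listed above concrete, a sketch (the DEFAULT
spelling here is the proposed syntax, not something committed, and the
names are illustrative):

CREATE TABLE t (key int, val text) PARTITION BY LIST (key);
CREATE TABLE t_3   PARTITION OF t FOR VALUES IN (3);
CREATE TABLE t_def PARTITION OF t DEFAULT;

-- 1. direct insertion into the default is rejected when the key belongs
--    to an existing partition, because t_def's implied constraint
--    (roughly "key is not in (3)") would be violated:
INSERT INTO t_def VALUES (3, 'x');   -- error
-- 2. the same constraint lets the planner exclude t_def for a query
--    like "WHERE key = 3"; tuple routing picks t_def only when no other
--    partition accepts the key:
INSERT INTO t VALUES (42, 'y');      -- routed to t_def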

Thanks,
Amit






Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-05 Thread Rushabh Lathia
On 2017/04/06 0:19, Robert Haas wrote:
> On Wed, Apr 5, 2017 at 5:57 AM, Rahila Syed 
wrote:
>>> Could you briefly elaborate why you think the lack global index support
>>> would be a problem in this regard?
>> I think the following can happen if we allow rows satisfying the new
>> partition to lie around in the default partition until a background
>> process moves them.
>> Consider a scenario where the partition key is a primary key and the data
>> in the default partition is not yet moved into the newly added partition.
>> If new data is now added into the new partition which also exists (same
>> key) in the default partition, there will be data duplication. If we now
>> scan the partitioned table for that key (from both the default and the new
>> partition, as we have not moved the rows), it will fetch both rows.
>> Unless we have global indexes for partitioned tables, there is chance of
>> data duplication between
>> child table added after default partition and the default partition.
>
> Yes, I think it would be completely crazy to try to migrate the data
> in the background:
>
> - The migration might never complete because of a UNIQUE or CHECK
> constraint on the partition to which rows are being migrated.
>
> - Even if the migration eventually succeeded, such a design abandons
> all hope of making INSERT .. ON CONFLICT DO NOTHING work sensibly
> while the migration is in progress, unless the new partition has no
> UNIQUE constraints.
>
> - Partition-wise join and partition-wise aggregate would need to have
> special case handling for the case of an unfinished migration, as
> would any user code that accesses partitions directly.
>
> - More generally, I think users expect that when a DDL command
> finishes execution, it's done all of the work that there is to do (or
> at the very least, that any remaining work has no user-visible
> consequences, which would not be the case here).

Thanks Robert for this explanation. This makes it clearer why row
movement by a background process is not a sensible idea.

On Thu, Apr 6, 2017 at 9:38 AM, Keith Fiske  wrote:

>
> On Wed, Apr 5, 2017 at 2:51 PM, Keith Fiske  wrote:
>
>>
>>
>> Only issue I see with this, and I'm not sure if it is an issue, is what
>> happens to that default constraint clause when 1000s of partitions start
>> getting added? From what I gather the default's constraint is built based
>> off the cumulative opposite of all other child constraints. I don't
>> understand the code well enough to see what it's actually doing, but if
>> there are no gaps, is the method used smart enough to aggregate all the
>> child constraints to make a simpler constraint that is simply outside the
>> current min/max boundaries? If so, for serial/time range partitioning this
>> should typically work out fine since there are rarely gaps. This actually
>> seems more of an issue for list partitioning where each child is a distinct
>> value or range of values that are completely arbitrary. Won't that check
>> and re-evaluation of the default's constraint just get worse and worse as
>> more children are added? Is there really even a need for the default to
>> have an opposite constraint like this? Not sure on how the planner works
>> with partitioning now, but wouldn't it be better to first check all
>> non-default children for a match the same as it does now without a default
>> and, failing that, then route to the default if one is declared? The
>> default should accept any data then so I don't see the need for the
>> constraint unless it's required for the current implementation. If that's
>> the case, could that be changed?
>>
>> Keith
>>
>
> Actually, thinking on this more, I realized this does again come back to
> the lack of a global index. Without the constraint, data could be put
> directly into the default that could technically conflict with the
> partition scheme elsewhere. Perhaps, instead of the constraint, inserts
> directly to the default could be prevented on the user level. Writing to
> valid children directly certainly has its place, but been thinking about
> it, and I can't see any reason why one would ever want to write directly to
> the default. It's use case seems to be around being a sort of temporary
> storage until that data can be moved to a valid location. Would still need
> to allow removal of data, though.
>
> Not sure if that's even a workable solution. Just trying to think of ways
> around the current limitations and still allow this feature.
>

I like the idea of having a DEFAULT partition for range partitioning. With
the way partitioning is designed, a range-partitioned table can have holes.
I think a DEFAULT partition for range partitioning is a good idea, mainly
when the range has holes. When the range is serial then of course a DEFAULT
partition doesn't make much sense.

Regards,

Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] Add pgstathashindex() to get hash index table statistics.

2017-04-05 Thread Ashutosh Sharma
Hi,

>>
>> Based on the earlier discussions, I have prepared a patch that would
>> allow pgstathashindex() to show the number of unused pages in hash
>> index. Please find the attached patch for the same. Thanks.
>
> My idea is that we shouldn't end up with both a zero_pages column and
> an unused_pages column.  Instead, we should end up with just an
> unused_pages column, which will include both pages that are all-zeroes
> and pages that have a valid special space marked as LH_UNUSED.
>
> Also, I don't see why it's correct to test PageIsEmpty() here instead
> of just testing the page type as we do in pageinspect.
>
> Attached is a revised patch that shows what I have in mind; please
> review.  Along the way I made the code for examining the page type
> more similar to what pageinspect does, because I see no reason for
> those things to be different, and I think the pageinspect code is
> better.

I have reviewed the patch and it looks good to me. Also, the idea of
including both zero and unused pages in a single 'unused' column looks
better. Thanks.
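
For reference, a usage sketch (the exact column list is my assumption of
what the revised patch reports; unused_pages is the column discussed above):

CREATE EXTENSION pgstattuple;
CREATE TABLE h (x int);
INSERT INTO h SELECT g FROM generate_series(1, 10000) g;
CREATE INDEX h_x_hash ON h USING hash (x);
SELECT bucket_pages, overflow_pages, bitmap_pages, unused_pages, free_percent
FROM pgstathashindex('h_x_hash');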

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com




[HACKERS] Re: Letting the client choose the protocol to use during a SASL exchange

2017-04-05 Thread Noah Misch
On Tue, Apr 04, 2017 at 03:02:30PM +0900, Michael Paquier wrote:
> There is still one open item pending for SCRAM that has not been
> treated which is mentioned here:
> https://www.postgresql.org/message-id/b081887e-1712-3aa4-7dbe-e012333d5...@iki.fi
> 
> When doing an authentication with SASL, the server decides what is the
> mechanism that the client has to use. As SCRAM-SHA-256 is only one of
> such mechanisms, it would be nice to have something more generic and
> have the server return to the client a list of protocols that the
> client can choose from. And also the server achnowledge which protocol
> is going to be used.
> 
> Note that RFC4422 has some content on the matter
> https://tools.ietf.org/html/rfc4422#section-3.1:
>Mechanism negotiation is protocol specific.
> 
>Commonly, a protocol will specify that the server advertises
>supported and available mechanisms to the client via some facility
>provided by the protocol, and the client will then select the "best"
>mechanism from this list that it supports and finds suitable.
> 
> So once the server sends back the list of mechanisms that are
> supported, the client is free to use what it wants.
> 
> On HEAD, a 'R' message with AUTH_REQ_SASL followed by
> SCRAM_SHA256_NAME is sent to let the client know what is the mechanism
> to use for the SASL exchange. In the future, this should be extended
> so as a list of names is sent, for example a comma-separated list, but
> we are free to choose the format we want here. With this list at hand,
> the client can then choose the protocol it thinks is the best. Still,
> there is a gap with our current implementation because the server
> expects the first message from the client to have a SCRAM format, but
> that's true only if SCRAM-SHA-256 is used as mechanism.
> 
> In order to cover this gap, it seems to me that we need to have an
> intermediate state before the server is switched to FE_SCRAM_INIT so
> as the mechanism used is negotiated between the two parties. Once the
> protocol negotiation is done, the server can then move on with the
> mechanism to use. This would be important in the future to allow more
> SASL mechanisms to work. I am adding an open item for that.

If any SCRAM open item is a beta blocker, it's this one.  (But SASLprep is
also in or near that status.)  Post-beta wire protocol changes are bad,
considering beta is normally the time for projects like pgjdbc and npgsql to
start adapting to such changes.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Heikki,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace

2017-04-05 Thread Kyotaro HORIGUCHI
At Thu, 06 Apr 2017 00:59:49 +0200, Pierre Ducroquet  
wrote in <2008148.rxBNyNRHPZ@peanuts2>
> But it all gets messy when we want to create a streaming standby server using 
> pg_basebackup. When backuping Pg 9.5, there is no issue, but backuping Pg 9.6 
> afterwards will say "directory "/mnt/ssd/postgres" exists but is not empty".

The documentation says as follows.

https://www.postgresql.org/docs/9.6/static/manage-ag-tablespaces.html

| The location must be an existing, empty directory that is owned
| by the PostgreSQL operating system user.

This explicitly prohibits sharing one tablespace directory among
multiple servers. The code is just missing the case of multiple
servers with different versions. I think the bug is rather that
Pg 9.6 allowed the tablespace to be created in this case.

The current naming rule for tablespace directories was introduced as
of 9.0 so that pg_upgrade (or pg_migrator at the time) could
perform in-place migration. It is not intended to allow sharing a
directory among multiple instances of different versions.

That being said, an additional trick in the attached file will
work for you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



--- reproduce-bug-tblspace.sh	2017-04-06 13:49:10.657803383 +0900
+++ reproduce-bug-tblspace.sh.new	2017-04-06 13:48:45.870024586 +0900
@@ -48,3 +48,14 @@
 echo " SECOND BASEBACKUP "
-echo "Expecting it to fail with 'directory \"/tmp/pg_bug_backup_tblspace/dest_server/tblspc\" exists but is not empty'"
+
+echo "stashing tablespace directories of the first backup"
+cd ..
+mv tblspc tblspc.tmp
+mkdir tblspc
+cd datas
+
 $PG2/pg_basebackup -h /tmp -p $PORT2 -D pg2 --tablespace-mapping=/tmp/pg_bug_backup_tblspace/source_server/tblspc=/tmp/pg_bug_backup_tblspace/dest_server/tblspc
+
+echo "merging tablespace directories"
+cd ..
+mv tblspc.tmp/* tblspc/
+rmdir tblspc.tmp



Re: [HACKERS] parallel bitmapscan isn't exercised in regression tests

2017-04-05 Thread Dilip Kumar
On Tue, Apr 4, 2017 at 5:51 AM, Dilip Kumar  wrote:
> Sure I can do that, In attached patch, I only fixed the problem of not
> executing the bitmap test.  Now, I will add few cases to cover other
> parts especially rescan and prefetching logic.

I have added two test cases to cover rescan, prefetch and lossy pages
logic for parallel bitmap.  I have removed the existing case because
these two new cases will be enough to cover that part as well.

Now, nodeBitmapHeapScan.c has 95.5% line coverage.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


parallel_bitmap_test_v2.patch
Description: Binary data



Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-04-05 Thread Jim Nasby

On 4/5/17 9:08 PM, Craig Ringer wrote:
> ... which I can't reproduce now. Even though I cleared ccache and "git
> reset -fdx" before I ran the above and got the crash.

Glad to hear that, since I can't repro at all. :)

> Assume it's a local system peculiarity. If I can reproduce again I'll
> dig into it.

Sounds good. Thanks!
--
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com




Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-05 Thread Keith Fiske
On Wed, Apr 5, 2017 at 2:51 PM, Keith Fiske  wrote:

>
>
> Only issue I see with this, and I'm not sure if it is an issue, is what
> happens to that default constraint clause when 1000s of partitions start
> getting added? From what I gather the default's constraint is built based
> off the cumulative opposite of all other child constraints. I don't
> understand the code well enough to see what it's actually doing, but if
> there are no gaps, is the method used smart enough to aggregate all the
> child constraints to make a simpler constraint that is simply outside the
> current min/max boundaries? If so, for serial/time range partitioning this
> should typically work out fine since there are rarely gaps. This actually
> seems more of an issue for list partitioning where each child is a distinct
> value or range of values that are completely arbitrary. Won't that check
> and re-evaluation of the default's constraint just get worse and worse as
> more children are added? Is there really even a need for the default to
> have an opposite constraint like this? Not sure on how the planner works
> with partitioning now, but wouldn't it be better to first check all
> non-default children for a match the same as it does now without a default
> and, failing that, then route to the default if one is declared? The
> default should accept any data then so I don't see the need for the
> constraint unless it's required for the current implementation. If that's
> the case, could that be changed?
>
> Keith
>

Actually, thinking on this more, I realized this does again come back to
the lack of a global index. Without the constraint, data could be put
directly into the default that could technically conflict with the
partition scheme elsewhere. Perhaps, instead of the constraint, inserts
directly to the default could be prevented on the user level. Writing to
valid children directly certainly has its place, but I've been thinking about
it, and I can't see any reason why one would ever want to write directly to
the default. Its use case seems to be around being a sort of temporary
storage until that data can be moved to a valid location. Would still need
to allow removal of data, though.

Not sure if that's even a workable solution. Just trying to think of ways
around the current limitations and still allow this feature.


Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-04-05 Thread Craig Ringer
On 6 April 2017 at 11:46, Craig Ringer  wrote:
> On 6 April 2017 at 09:40, Jim Nasby  wrote:
>> On 4/4/17 7:44 PM, Craig Ringer wrote:
>>>
>>> The patch crashes in initdb with --enable-cassert builds:
>>
>>
>> Thanks for the review! I'll get to the rest of it in a bit, but I'm unable
>> to reproduce the initdb failure. I looked at the assert line and I don't see
>> anything obvious either. :/
>>
>> Can you send your full configure call? uname -a? Mine is:
>>
>> ./configure --with-includes=/opt/local/include
>> --with-libraries=/opt/local/lib --enable-debug --with-libxml --with-tcl
>> --with-perl --with-python --enable-depend --enable-dtrace --enable-tap-tests
>> --prefix=/Users/decibel/pgsql/HEAD/i/i --with-pgport=$PGC_PORT -C
>> --enable-cassert --enable-debug CFLAGS='-ggdb -O0 -fno-omit-frame-pointer'
>
>
> ./configure --prefix=/home/craig/pg/10 --enable-cassert --enable-debug
> --enable-tap-tests --with-python
> make -s clean
> make -s -j4
> make check
>
> results in the crash here.

... which I can't reproduce now. Even though I cleared ccache and "git
reset -fdx" before I ran the above and got the crash.

Assume it's a local system peculiarity. If I can reproduce again I'll
dig into it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-04-05 Thread Jim Nasby

On 4/5/17 7:44 PM, Jim Nasby wrote:
> Updated patches attached, but I still need to update the docs.

Attached is a complete series of patches that includes the docs patch.

Right now, the docs don't include a concrete example, because adding one
would be pretty large if it demonstrated real usage, which presumably
means Yet Another Contrib Module strictly for the purpose of 
demonstrating something. Rather than doing that, ISTM it'd be better to 
point the user at what plpythonu is doing.


Another option would be to have a very simple example that only uses 
*receiveSlot, but that seems rather pointless to me.

--
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
From 0a2ef661f55a047763a43b0eebd7483760e4a427 Mon Sep 17 00:00:00 2001
From: Jim Nasby 
Date: Wed, 5 Apr 2017 20:52:39 -0500
Subject: [PATCH 1/3] Add SPI_execute_callback

Instead of placing results in a tuplestore, this method of execution
uses the supplied callback when creating the Portal for a query.
---
 src/backend/executor/spi.c | 80 --
 src/backend/tcop/dest.c| 15 +
 src/include/executor/spi.h |  4 +++
 src/include/tcop/dest.h|  1 +
 4 files changed, 90 insertions(+), 10 deletions(-)

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index ca547dc6d9..4f6c3011f9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, 
SPIPlanPtr plan);
 
 static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
  Snapshot snapshot, Snapshot 
crosscheck_snapshot,
- bool read_only, bool fire_triggers, uint64 
tcount);
+ bool read_only, bool fire_triggers, uint64 
tcount,
+ DestReceiver *callback);
 
 static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
@@ -321,7 +322,35 @@ SPI_execute(const char *src, bool read_only, long tcount)
 
res = _SPI_execute_plan(&plan, NULL,
InvalidSnapshot, 
InvalidSnapshot,
-   read_only, true, 
tcount);
+   read_only, true, 
tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+
+int
+SPI_execute_callback(const char *src, bool read_only, long tcount,
+   DestReceiver *callback)
+{
+   _SPI_plan   plan;
+   int res;
+
+   if (src == NULL || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   memset(&plan, 0, sizeof(_SPI_plan));
+   plan.magic = _SPI_PLAN_MAGIC;
+   plan.cursor_options = 0;
+
+   _SPI_prepare_oneshot_plan(src, &plan);
+
+   res = _SPI_execute_plan(&plan, NULL,
+   InvalidSnapshot, 
InvalidSnapshot,
+   read_only, true, 
tcount, callback);
 
_SPI_end_call(true);
return res;
@@ -355,7 +384,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const 
char *Nulls,

_SPI_convert_params(plan->nargs, plan->argtypes,

Values, Nulls),
InvalidSnapshot, 
InvalidSnapshot,
-   read_only, true, 
tcount);
+   read_only, true, 
tcount, NULL);
+
+   _SPI_end_call(true);
+   return res;
+}
+
+/* Execute a previously prepared plan with a callback */
+int
+SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls,
+bool read_only, long tcount, DestReceiver 
*callback)
+{
+   int res;
+
+   if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0)
+   return SPI_ERROR_ARGUMENT;
+
+   if (plan->nargs > 0 && Values == NULL)
+   return SPI_ERROR_PARAM;
+
+   res = _SPI_begin_call(true);
+   if (res < 0)
+   return res;
+
+   res = _SPI_execute_plan(plan,
+   
_SPI_convert_params(plan->nargs, plan->argtypes,
+   
Values, Nulls),
+   InvalidSnapshot, 
InvalidSnapshot,
+   read_only, true, 
tcount, callback);
 
_SPI_end_call(true);

Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-04-05 Thread Craig Ringer
On 6 April 2017 at 09:40, Jim Nasby  wrote:
> On 4/4/17 7:44 PM, Craig Ringer wrote:
>>
>> The patch crashes in initdb with --enable-cassert builds:
>
>
> Thanks for the review! I'll get to the rest of it in a bit, but I'm unable
> to reproduce the initdb failure. I looked at the assert line and I don't see
> anything obvious either. :/
>
> Can you send your full configure call? uname -a? Mine is:
>
> ./configure --with-includes=/opt/local/include
> --with-libraries=/opt/local/lib --enable-debug --with-libxml --with-tcl
> --with-perl --with-python --enable-depend --enable-dtrace --enable-tap-tests
> --prefix=/Users/decibel/pgsql/HEAD/i/i --with-pgport=$PGC_PORT -C
> --enable-cassert --enable-debug CFLAGS='-ggdb -O0 -fno-omit-frame-pointer'


./configure --prefix=/home/craig/pg/10 --enable-cassert --enable-debug
--enable-tap-tests --with-python
make -s clean
make -s -j4
make check

results in the crash here.

if I add

CFLAGS=""

to the arguments (which suppresses the default "-O2"), or pass

CFLAGS="-O0"

then the crash goes away.



$ python --version
Python 2.7.11

$ lsb_release -a
LSB Version: 
:core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: Fedora
Description: Fedora release 23 (Twenty Three)
Release: 23
Codename: TwentyThree

$ gcc --version
gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-04-05 Thread Jim Nasby

On 4/4/17 7:44 PM, Craig Ringer wrote:

On 5 April 2017 at 08:23, Craig Ringer  wrote:

On 5 April 2017 at 08:00, Craig Ringer  wrote:

This patch fails to update the documentation at all.

https://www.postgresql.org/docs/devel/static/spi.html


I'll fix that soon.


missing newline


Fixed.


+/* Execute a previously prepared plan with a callback Destination */


caps "Destination"


Hmm, I capitalized it since DestReceiver is capitalized. I guess it's 
best to just drop Destination.



+// XXX throw error if callback is set


Fixed (opted to just use an Assert).


+static DestReceiver spi_callbackDR = {
+donothingReceive, donothingStartup, donothingCleanup, donothingCleanup,
+DestSPICallback
+};
Presumably that's a default destination you're then supposed to modify
with your own callbacks? There aren't any comments to indicate what's
going on here.


Correct. Added the following comment:


/*
 * This is strictly a starting point for creating a callback. It should not
 * actually be used.
 */





Comments on patch 2:

If this is the "bare minimum" patch, what is pending? Does it impose
any downsides or limits?


No limits. I'm not aware of any downsides.

It's "bare minimum" because a follow-on step is to allow different 
methods of returning results. In particular, my testing indicates that 
patch 1 + returning a dict of lists (as opposed to the current list of 
dicts) results in a 460% improvement vs ~30% with patch 2.



+/* Get switch execution contexts */
+extern PLyExecutionContext
*PLy_switch_execution_context(PLyExecutionContext *new);

Comment makes no sense to me. This seems something like memory context
switch, where you supply the new and return the old. But the comment
doesn't explain it at all.


Comment changed to
/* Switch execution context (similar to MemoryContextSwitchTo) */


+void PLy_CSStartup(DestReceiver *self, int operation, TupleDesc typeinfo);
+void PLy_CSDestroy(DestReceiver *self);

These are declared in the plpy_spi.c. Why aren't these static?


Derp. Fixed.


+volatile MemoryContext oldcontext;
+volatile ResourceOwner oldowner;
 int rv;
-volatile MemoryContext oldcontext;
-volatile ResourceOwner oldowner;

Unnecessary code movement.


IMHO it reads better that way. I've left it for now so $COMMITTER can 
decide, but if it really bugs you let me know and I'll swap it.



In PLy_Callback_New, I think your use of a sub-context is sensible. Is
it necessary to palloc0 though?


Hrm, maybe not... but it seems like cheap insurance considering the 
amount of other stuff involved in just starting a new SPI call. And 
honestly, I'd rather not mess with it at this point. :) I have added an 
XXX comment though.



+static CallbackState *
+PLy_Callback_Free(CallbackState *callback)

The code here looks like it could be part of spi.c's callback support,
rather than plpy_spi specific, since you provide a destroy callback in
the SPI callback struct.


I added this for use in PG_CATCH() blocks, because it's not clear to me 
that the portal gets cleaned up in those cases. It's certainly possible 
that it's pointless.


FWIW, I'm pretty sure I copied that pattern from somewhere else... 
probably plpgsql or pltcl.



+/* We need to store this because the TupleDesc the receive
function gets has no names. */
+myState->desc = typeinfo;

Is it safe to just store the pointer to the TupleDesc here? What's the lifetime?


Only Portal lifetime.


+ * will clean it up when the time is right. XXX This might result in a leak
+ * if an error happens and the result doesn't get dereferenced.
+ */
+MemoryContextSwitchTo(TopMemoryContext);
+result->tupdesc = CreateTupleDescCopy(typeinfo);

especially given this XXX comment...


I've changed the comment to the following. Hopefully this clarifies things:


/*
 * Save tuple descriptor for later use by result set metadata
 * functions.  Save it in TopMemoryContext so that it survives outside 
of
 * an SPI context.  We trust that PLy_result_dealloc() will clean it up
 * when the time is right. The difference between result and everything
 * else is that result needs to survive after the portal is destroyed,
 * because result is what's handed back to the plpython function. While
 * it's tempting to use something other than TopMemoryContext, that 
won't
 * work: the user could potentially put result into the global 
dictionary,
 * which means it could survive as long as the session does.  This might
 * result in a leak if an error happens and the result doesn't get
 * dereferenced, but if that happens it means the python GC has failed 
us,
 * at which point we probably have bigger problems.
 *
 * This still isn't perfect though; if something the result tupledesc
 * references has its OID changed then the tupledesc 

[HACKERS] Interval for launching the table sync worker

2017-04-05 Thread Masahiko Sawada
Hi all,

While testing the table sync worker for logical replication I noticed that
if the table sync worker fails to insert the data for whatever reason, the
worker process exits with an error. The main apply worker then launches the
table sync worker again immediately, with no interval, so this cycle repeats
at very high frequency.

Should we put an interval (wal_retrieve_retry_interval, or a new GUC
parameter?) between launches of the table sync worker?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Parallel Append implementation

2017-04-05 Thread Andres Freund
On 2017-04-05 14:52:38 +0530, Amit Khandekar wrote:
> This is what the earlier versions of my patch had done : just add up
> per-subplan parallel_workers (1 for non-partial subplan and
> subpath->parallel_workers for partial subplans) and set this total as
> the Append parallel_workers.

I don't think that's great; consider e.g. the case where you have one
very expensive query and a bunch of cheaper ones. Most of those workers
wouldn't do much while waiting for the expensive query.  What I'm
basically thinking we should do is something like the following
pythonesque pseudocode:

best_nonpartial_cost = -1
best_nonpartial_nworkers = -1

for numworkers in 1...#max workers:
   worker_work = [0 for x in range(0, numworkers)]

   nonpartial_cost += startup_cost * numworkers

   # distribute all nonpartial tasks over workers.  Assign tasks to the
   # worker with the least amount of work already performed.
   for task in all_nonpartial_subqueries:
   least_busy_worker = worker_work.smallest()
   least_busy_worker += task.total_nonpartial_cost

   # the nonpartial cost here is the largest amount any single worker
   # has to perform.
   nonpartial_cost += worker_work.largest()

   total_partial_cost = 0
   for task in all_partial_subqueries:
   total_partial_cost += task.total_nonpartial_cost

   # Compute resources needed by partial tasks. First compute how much
   # cost we can distribute to workers that take shorter than the
   # "busiest" worker doing non-partial tasks.
   remaining_avail_work = 0
   for i in range(0, numworkers):
   remaining_avail_work += worker_work.largest() - worker_work[i]

   # Equally divide up remaining work over all workers
   if remaining_avail_work < total_partial_cost:
  nonpartial_cost += (worker_work.largest - remaining_avail_work) / 
numworkers

   # check if this is the best number of workers
   if best_nonpartial_cost == -1 or best_nonpartial_cost > nonpartial_cost:
  best_nonpartial_cost = worker_work.largest
  best_nonpartial_nworkers = nworkers

Does that make sense?


> BTW all of the above points apply only for non-partial plans.

Indeed. But I think that's going to be a pretty common type of plan,
especially if we get partitionwise joins.


Greetings,

Andres Freund




Re: [HACKERS] Re: Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)

2017-04-05 Thread Noah Misch
On Wed, Apr 05, 2017 at 12:16:25AM -0700, Andres Freund wrote:
> On 2017-04-05 02:47:55 -0400, Noah Misch wrote:
> > On Fri, Mar 10, 2017 at 11:49:46AM -0800, Andres Freund wrote:
> > > On 2017-03-09 13:34:22 -0500, Robert Haas wrote:
> > > > On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane  wrote:
> > > > > Andres Freund  writes:
> > > > >> Wonder if we there's an argument to be made for implementing this
> > > > >> roughly similarly to split_pathtarget_at_srf - instead of injecting a
> > > > >> ProjectSet node we'd add a FunctionScan node below a Result node.
> > > > >
> > > > > Yeah, possibly.  That would have the advantage of avoiding an 
> > > > > ExecProject
> > > > > step when the SRFs aren't buried, which would certainly be the 
> > > > > expected
> > > > > case.
> > > > >
> > > > > If you don't want to make ExecInitExpr responsible, then the planner 
> > > > > would
> > > > > have to do something like split_pathtarget_at_srf anyway to decompose 
> > > > > the
> > > > > expressions, no matter which executor representation we use.
> > > > 
> > > > Did we do anything about this?  Are we going to?
> > > 
> > > Working on a patch.
> > 
> > [Action required within three days.  This is a generic notification.]
> > 
> > The above-described topic is currently a PostgreSQL 10 open item.  Andres,
> > since you committed the patch believed to have created it, you own this open
> > item.  If some other commit is more relevant or if this does not belong as a
> > v10 open item, please let us know.  Otherwise, please observe the policy on
> > open item ownership[1] and send a status update within three calendar days 
> > of
> > this message.  Include a date for your subsequent status update.  Testers 
> > may
> > discover new open items at any time, and I want to plan to get them all 
> > fixed
> > well in advance of shipping v10.  Consequently, I will appreciate your 
> > efforts
> > toward speedy resolution.  Thanks.
> 
> I've a very preliminary patch.  I'd like to only start polishing it up
> once the code freeze is over, so I can work on getting other patches in -
> note that I myself have no pending patches.  Once frozen I'll polish it
> up and send that within a few days.
> 
> Ok?

Okay; using my simplistic translator of "a few", I'll look for your next
status update on or before 2017-04-11.  As always, feel free to set a
different date.




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-05 Thread Noah Misch
On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:
> On Wed, Apr 5, 2017 at 3:45 PM, Noah Misch  wrote:
> > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
> >> Regarding this feature, there are some loose ends. We should work on
> >> and complete them until the release.
> >>
> >> (1)
> >> Which synchronous replication method, priority or quorum, should be
> >> chosen when neither FIRST nor ANY is specified in s_s_names? Right now,
> >> a priority-based sync replication is chosen for keeping backward
> >> compatibility. However some hackers argued to change this decision
> >> so that a quorum commit is chosen because they think that most users
> >> prefer a quorum.
> >>
> >> (2)
> >> There will still be many source comments and documentation sections that
> >> we need to update, for example, in high-availability.sgml. We need to
> >> check and update them thoroughly.
> >>
> >> (3)
> >> The priority value is assigned to each standby listed in s_s_names
> >> even in quorum commit though those priority values are not used at all.
> >> Users can see those priority values in pg_stat_replication.
> >> Isn't this confusing? If yes, it might be better to always assign 1 as
> >> the priority, for example.
> >
> > [Action required within three days.  This is a generic notification.]
> >
> > The above-described topic is currently a PostgreSQL 10 open item.  Fujii,
> > since you committed the patch believed to have created it, you own this open
> > item.  If some other commit is more relevant or if this does not belong as a
> > v10 open item, please let us know.  Otherwise, please observe the policy on
> > open item ownership[1] and send a status update within three calendar days 
> > of
> > this message.  Include a date for your subsequent status update.  Testers 
> > may
> > discover new open items at any time, and I want to plan to get them all 
> > fixed
> > well in advance of shipping v10.  Consequently, I will appreciate your 
> > efforts
> > toward speedy resolution.  Thanks.
> >
> > [1] 
> > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> 
> Thanks for the notice!
> 
> Regarding the item (2), Sawada-san told me that he will work on it after
> this CommitFest finishes. So we would receive the patch for the item from
> him next week. If there is no patch even after the end of next week
> (i.e., April 14th), I will do it myself. Let's wait for Sawada-san's action first.

Sounds reasonable; I will look for your update on 14Apr or earlier.

> The items (1) and (3) are not bugs. So I don't think that they need to be
> resolved before the beta release. After the feature freeze, many users
> will try and play with many new features including quorum-based syncrep.
> Then if many of them complain about (1) and (3), we can change the code
> at that point. So we need more time for users to try the feature.

I've moved (1) to a new section for things to revisit during beta.  If someone
feels strongly that the current behavior is Wrong and must change, speak up as
soon as you reach that conclusion.  Absent such arguments, the behavior won't
change.

> BTW, IMO (3) should be fixed so that pg_stat_replication reports NULL
> as the priority if quorum-based sync rep is chosen. It's less confusing.

Since you do want (3) to change, please own it like any other open item,
including the mandatory status updates.




Re: [HACKERS] Faster methods for getting SPI results (460% improvement)

2017-04-05 Thread Jim Nasby

On 4/4/17 7:44 PM, Craig Ringer wrote:

The patch crashes in initdb with --enable-cassert builds:


Thanks for the review! I'll get to the rest of it in a bit, but I'm 
unable to reproduce the initdb failure. I looked at the assert line and 
I don't see anything obvious either. :/


Can you send your full configure call? uname -a? Mine is:

./configure --with-includes=/opt/local/include 
--with-libraries=/opt/local/lib --enable-debug --with-libxml --with-tcl 
--with-perl --with-python --enable-depend --enable-dtrace 
--enable-tap-tests --prefix=/Users/decibel/pgsql/HEAD/i/i 
--with-pgport=$PGC_PORT -C --enable-cassert --enable-debug CFLAGS='-ggdb 
-O0 -fno-omit-frame-pointer'


Darwin decina.local 15.6.0 Darwin Kernel Version 15.6.0: Mon Jan  9 
23:07:29 PST 2017; root:xnu-3248.60.11.2.1~1/RELEASE_X86_64 x86_64

--
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-04-05 Thread Pavan Deolasee
On Thu, Apr 6, 2017 at 1:06 AM, Robert Haas  wrote:

> On Wed, Apr 5, 2017 at 2:32 PM, Pavan Deolasee 
> wrote:
>
> > The other way is to pass old tuple values along with the new tuple
> values to
> > amwarminsert, build index tuples and then do a comparison. For duplicate
> > index tuples, skip WARM inserts.
>
> This is more what I was thinking.  But maybe one of the other ideas
> you wrote here is better; not sure.
>
>
Ok. I think I suggested this as one of the ideas upthread, to support hash
indexes for example. This might be a good safety-net, but AFAIC what we
have today should work since we pretty much construct index tuples in a
consistent way before doing a comparison.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Re: new set of psql patches for loading (saving) data from (to) text, binary files

2017-04-05 Thread Stephen Frost
Andres,

* Andres Freund (and...@anarazel.de) wrote:
> On 2017-04-05 21:07:59 -0400, Stephen Frost wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> > > I don't yet have a good idea how to deal with moving individual cells
> > > into files, so they can be loaded.  One approach would be to have
> > > something like
> > > 
> > > \storequeryresult filename_template.%row.%column
> > > 
> > > which'd then print the current query buffer into the relevant file after
> > > doing replacement on %row and %column.
> > 
> > I don't actually agree that there's a problem having the output from a
> > query stored directly in binary form into a single file.  The above
> > approach seems to imply that every binary result must go into an
> > independent file, and perhaps that would be useful in some cases, but I
> > don't see it as required.
> 
> Well, it'd not be enforced - it'd depend on your template.  But for a
> lot of types of files, it'd not make sense to store multiple
> columns/rows in one file. Particularly for ones where printing them out to
> files is actually meaningful (i.e. binary ones).

Having the template not require the row/column place-holders included
strikes me as more likely to be confusing.  My initial thinking around
this was that users who actually want independent files would simply
issue independent queries, while users who want to take a bunch of int4
columns and dump them into a single binary file would be able to do so
easily.

I'm not against adding the ability for a single query result to be saved
into independent files, but it strikes me as feature creep on this basic
capability.  Further, I don't see any particular reason why splitting up
the output from a query into multiple files is only relevant for binary
data.

Thanks!

Stephen




Re: [HACKERS] Logical Replication and Character encoding

2017-04-05 Thread Kyotaro HORIGUCHI
At Wed, 5 Apr 2017 11:33:51 -0400, Peter Eisentraut 
 wrote in 
<5401fef6-c0c0-7e8a-d8b1-169e30cbd...@2ndquadrant.com>
> After further thinking, I prefer the alternative approach of using
> pq_sendcountedtext() as is and sticking the trailing zero byte on, on the
> receiving side.  This is a more localized change, and keeps the logical
> replication protocol consistent with the main FE/BE protocol.  (Also, we
> don't need to send a useless byte around.)

I'm not sure about the significance of the trailing zero in the
logical replication protocol. Anyway, the patch works.

> Patch attached, and also a test case.

The problem was revealed when a string is shortened by encoding
conversion. The test covers the situation.

- The patches apply cleanly on master.
- The patch works for the UTF-8 => EUC_JP case.
- The test seems proper.


By the way, an untranslatable character on the publisher table
stops walsender with the following error.

> ERROR:  character with byte sequence 0xe6 0xbc 0xa2 in encoding "UTF8" has no 
> equivalent in encoding "LATIN1"
> STATEMENT:  COPY public.t TO STDOUT
> LOG:  could not send data to client: Broken pipe
> FATAL:  connection to client lost

walreceiver stops on the opposite side with the following
complaint.

> ERROR:  could not receive data from WAL stream: ERROR:  character with byte 
> sequence 0xe6 0xbc 0xa2 in encoding "UTF8" has no equivalent in encoding 
> "LATIN1"
> CONTEXT:  COPY t, line 1: ""
> LOG:  worker process: logical replication worker for subscription 16391 sync 
> 16384 (PID 26915) exited with exit code 1

After this, walreceiver repeatedly reconnects to the master with no
wait. Maybe walreceiver had better refrain from reconnecting
after certain kinds of failure, but it is not an urgent issue.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test

2017-04-05 Thread Noah Misch
On Wed, Apr 05, 2017 at 11:13:33AM -0400, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Stephen Frost  writes:
> > > I'm all for adding tests into pg_dump which do things like drop columns
> > > and change column names and other cases which could impact if the
> > > pg_dump is correct or not, and there's nothing preventing those tests
> > > from being added in the existing structure.  Certainly, before we remove
> > > the coverage provided by running the serial test suite and then using
> > > pg_upgrade, we should analyze what is being tested and ensure that we're
> > > providing that same set of testing in the pg_dump TAP tests.
> > 
> > I don't think you grasped my basic point, which is that I'm worried about
> > emergent cases that we don't foresee needing to test (and that no amount
> > of code coverage checking would have shown up as being overlooked).
> > Admittedly, relying on the core regression tests to trigger such cases is
> > a pretty haphazard strategy, but it's way better than no strategy at all.
> 
> The tests that were added to the core regression suite were done so for
> a reason and hopefully we can identify cases where it'd make sense to
> also run those tests for pg_upgrade/pg_dump testing.

I think you _are_ missing Tom's point.  We've caught pg_dump and pg_upgrade
bugs thanks to regression database objects created for purposes unrelated to
pg_dump.  It's true that there exist other test strategies that are more
efficient or catch more bugs overall.  None of them substitute 100% for the
serendipity seen in testing dump/restore on the regression database.

> More-or-less
> anything that materially changes the catalog should be included, I would
> think.  Things that are really only working with the heap/index
> files don't really need to be performed because the pg_upgrade process
> doesn't change those.

That is formally true.


Also, I agree with Andres that this is not a thread for discussing test
changes beyond mechanical translation of the pg_upgrade test suite.




Re: [HACKERS] strange parallel query behavior after OOM crashes

2017-04-05 Thread Robert Haas
On Wed, Apr 5, 2017 at 8:17 PM, Neha Khatri  wrote:
> The problem here seems to be the change in the max_parallel_workers value
> while the parallel workers are still under execution. So this poses two
> questions:
>
> 1. From a use-case point of view, why could there be a need to tweak
> max_parallel_workers exactly at the time when the parallel workers are at
> play?
> 2. Could there be a restriction on tweaking max_parallel_workers while
> the parallel workers are at play? At least do not allow setting
> max_parallel_workers to less than the current # of active parallel workers.

Well, that would be letting the tail wag the dog.  The maximum value
of max_parallel_workers is only 1024, and what we're really worried
about here is seeing a value near PG_UINT32_MAX, which leaves a lot of
daylight.  How about just creating a #define that's used by guc.c as
the maximum for the GUC, and here we assert that we're <= that value?
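
Purely as an illustration (the macro name and where the check would live are
assumptions, not a concrete patch), something along these lines:

#include <assert.h>
#include <stdint.h>

/* Illustrative name; guc.c would use this as the GUC's upper bound. */
#define MAX_PARALLEL_WORKER_LIMIT 1024

/*
 * Sanity check: the number of active parallel workers (registered minus
 * terminated) must stay within the compile-time limit.  This also catches
 * the underflow case where the difference ends up near PG_UINT32_MAX.
 */
static inline void
check_parallel_worker_counts(uint32_t parallel_register_count,
                             uint32_t parallel_terminate_count)
{
    assert(parallel_register_count >= parallel_terminate_count);
    assert(parallel_register_count - parallel_terminate_count <=
           MAX_PARALLEL_WORKER_LIMIT);
}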

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: new set of psql patches for loading (saving) data from (to) text, binary files

2017-04-05 Thread Andres Freund
Hi,

On 2017-04-05 21:07:59 -0400, Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
> > I don't like the API here much.  Loading requires knowledge of some
> > magic $1 value and allows only a single column, and printing doesn't mean
> > much when there's multiple columns / rows.
> 
> > I think the loading side of things should be redesigned into a more
> > general facility for providing query parameters.  E.g. something like
> > \setparam $1 'whateva'
> > \setparamfromfile $2 'somefile'
> > \setparamfromprogram $3 cat /frakbar
> >
> > which then would get used in the next query sent to the server.  That'd
> > allow importing multiple columns, and it'd be useful for other purposes
> > than just loading binary data.
> 
> I tend to agree that the loading side should probably be thought through
> more.
> 
> > I don't yet have a good idea how to deal with moving individual cells
> > into files, so they can be loaded.  One approach would be to have
> > something like
> > 
> > \storequeryresult filename_template.%row.%column
> > 
> > which'd then print the current query buffer into the relevant file after
> > doing replacement on %row and %column.
> 
> I don't actually agree that there's a problem having the output from a
> query stored directly in binary form into a single file.  The above
> approach seems to imply that every binary result must go into an
> independent file, and perhaps that would be useful in some cases, but I
> don't see it as required.

Well, it'd not be enforced - it'd depend on your template.  But for a
lot of types of files, it'd not make sense to store multiple
columns/rows in one file. Particularly for ones where printing them out to
files is actually meaningful (i.e. binary ones).

- Andres




Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-05 Thread Amit Langote
On 2017/04/06 0:19, Robert Haas wrote:
> On Wed, Apr 5, 2017 at 5:57 AM, Rahila Syed  wrote:
>>> Could you briefly elaborate why you think the lack of global index
>>> support would be a problem in this regard?
>> I think the following can happen if we allow rows satisfying the new
>> partition to lie around in the default partition until a background
>> process moves them.
>> Consider a scenario where the partition key is a primary key and the data
>> in the default partition has not yet been moved into the newly added
>> partition. If new data which also exists (same key) in the default
>> partition is now added into the new partition, there will be data
>> duplication. If we now scan the partitioned table for that key (from both
>> the default and the new partition, as we have not moved the rows), it will
>> fetch both rows.
>> Unless we have global indexes for partitioned tables, there is a chance of
>> data duplication between a child table added after the default partition
>> and the default partition.
> 
> Yes, I think it would be completely crazy to try to migrate the data
> in the background:
> 
> - The migration might never complete because of a UNIQUE or CHECK
> constraint on the partition to which rows are being migrated.
> 
> - Even if the migration eventually succeeded, such a design abandons
> all hope of making INSERT .. ON CONFLICT DO NOTHING work sensibly
> while the migration is in progress, unless the new partition has no
> UNIQUE constraints.
> 
> - Partition-wise join and partition-wise aggregate would need to have
> special case handling for the case of an unfinished migration, as
> would any user code that accesses partitions directly.
> 
> - More generally, I think users expect that when a DDL command
> finishes execution, it's done all of the work that there is to do (or
> at the very least, that any remaining work has no user-visible
> consequences, which would not be the case here).

OK, I realize the background migration was a poorly thought out idea.  And
a *first* version that will handle the row-movement should be doing that
as part of the same command anyway.

> IMV, the question of whether we have efficient ways to move data
> around between partitions is somewhat separate from the question of
> whether partitions can be defined in a certain way in the first place.
> The problems that Keith refers to upthread already exist for
> subpartitioning; you've got to detach the old partition, create a new
> one, and then reinsert the data.  And for partitioning an
> unpartitioned table: create a replacement table, insert all the data,
> substitute it for the original table.  The fact that we have these
> limitation is not good, but we're not going to rip out partitioning
> entirely because we don't have clever ways of migrating the data in
> those cases, and the proposed behavior here is not any worse.
>
> Also, waiting for those problems to get fixed might be waiting for
> Godot.  I'm not really all that sanguine about our chances of coming
> up with a really nice way of handling these cases.  In a designed
> based on table inheritance, you can leave it murky where certain data
> is supposed to end up and migrate it on-line and you might get away
> with that, but a major point of having declarative partitioning at all
> is to remove that sort of murkiness.  It's probably not that hard to
> come up with a command that locks the parent and moves data around via
> full table scans, but I'm not sure how far that really gets us; you
> could do the same thing easily enough with a sequence of commands
> generated via a script.  And being able to do this in a general way
> without a full table lock looks pretty hard - it doesn't seem
> fundamentally different from trying to perform a table-rewriting
> operation like CLUSTER without a full table lock, which we also don't
> support.  The executor is not built to cope with any aspect of the
> table definition shifting under it, and that includes the set of child
> tables with are partitions of the table mentioned in the query.  Maybe
> the executor can be taught to survive such definitional changes at
> least in limited cases, but that's a much different project than
> allowing default partitions.

Agreed.

Thanks,
Amit






Re: [HACKERS] [COMMITTERS] pgsql: Collect and use multi-column dependency stats

2017-04-05 Thread David Rowley
On 6 April 2017 at 13:05, David Rowley  wrote:
> I tested with the attached, and it does not seem to hurt planner
> performance executing:

Here it is again, this time with a comment on the
find_relation_from_clauses() function.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


find_relation_from_clauses_v2.patch
Description: Binary data



Re: [HACKERS] Re: new set of psql patches for loading (saving) data from (to) text, binary files

2017-04-05 Thread Stephen Frost
Andres,

* Andres Freund (and...@anarazel.de) wrote:
> I don't like the API here much.  Loading requires knowledge of some
> magic $1 value and allows only a single column, and printing doesn't mean
> much when there's multiple columns / rows.

> I think the loading side of things should be redesigned into a more
> general facility for providing query parameters.  E.g. something like
> \setparam $1 'whateva'
> \setparamfromfile $2 'somefile'
> \setparamfromprogram $3 cat /frakbar
>
> which then would get used in the next query sent to the server.  That'd
> allow importing multiple columns, and it'd be useful for other purposes
> than just loading binary data.

I tend to agree that the loading side should probably be thought through
more.

> I don't yet have a good idea how to deal with moving individual cells
> into files, so they can be loaded.  One approach would be to have
> something like
> 
> \storequeryresult filename_template.%row.%column
> 
> which'd then print the current query buffer into the relevant file after
> doing replacement on %row and %column.

I don't actually agree that there's a problem having the output from a
query stored directly in binary form into a single file.  The above
approach seems to imply that every binary result must go into an
independent file, and perhaps that would be useful in some cases, but I
don't see it as required.

> I don't think we can find an API we agree upon in the next 48h...

Now that there's more than one opinion being voiced on the API, I tend
to agree with this.  Hopefully we can keep the discussion moving
forward, however, as I do see value in this capability in general.

Thanks!

Stephen




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-04-05 Thread Robert Haas
On Wed, Apr 5, 2017 at 2:42 AM, Ashutosh Bapat
 wrote:
> Only inner join conditions have equivalence classes associated with
> those. Outer join conditions create single element equivalence
> classes. So, we can not associate equivalence classes as they are with
> partition scheme. If we could do that, it makes life much easier since
> checking whether equi-join between all partition keys exist, is simply
> looking up equivalence classes that cover joining relations and find
> em_member corresponding to partition keys.

OK.

> It looks like we should only keep strategy, partnatts, partopfamily
> and parttypcoll in PartitionScheme. A partition-wise join between two
> relations would be possible if all those match.

Yes, I think so. Conceivably you could even exclude partnatts and
strategy, since there's nothing preventing a partitionwise join
between a list-partitioned table and a range-partitioned table, or
between a table range-partitioned on (a) and another range-partitioned
on (a, b), but there is probably not much benefit in trying to cover
such cases.  I think it's reasonable to tell users that this is only
going to work when the partitioning strategy is the same and the join
conditions include all of the partitioning columns on both sides.

> There's a relevant comment in 0006, build_joinrel_partition_info()
> (probably that name needs to change, but I will do that once we have
> settled on design)
> +   /*
> +* Construct partition keys for the join.
> +*
> +* An INNER join between two partitioned relations is partitioned by key
> +* expressions from both the relations. For tables A and B partitioned by a and b
> +* respectively, (A INNER JOIN B ON A.a = B.b) is partitioned by both A.a
> +* and B.b.
> +*
> +* An OUTER join like (A LEFT JOIN B ON A.a = B.b) may produce rows with
> +* B.b NULL. These rows may not fit the partitioning conditions imposed on
> +* B.b. Hence, strictly speaking, the join is not partitioned by B.b.
> +* Strictly speaking, partition keys of an OUTER join should include
> +* partition key expressions from the OUTER side only. Consider a join like
> +* (A LEFT JOIN B on (A.a = B.b)) LEFT JOIN C ON B.b = C.c. If we do not
> +* include B.b as partition key expression for (AB), it prohibits us from
> +* using partition-wise join when joining (AB) with C as there is no
> +* equi-join between partition keys of joining relations. But two NULL
> +* values are never equal and no two rows from mis-matching partitions can
> +* join. Hence it's safe to include B.b as partition key expression for
> +* (AB), even though rows in (AB) are not strictly partitioned by B.b.
> +*/
>
> I think that also needs to be reviewed carefully.

The following passage from src/backend/optimizer/README seems highly relevant:

===
The planner's treatment of outer join reordering is based on the following
identities:

1.  (A leftjoin B on (Pab)) innerjoin C on (Pac)
= (A innerjoin C on (Pac)) leftjoin B on (Pab)

where Pac is a predicate referencing A and C, etc (in this case, clearly
Pac cannot reference B, or the transformation is nonsensical).

2.  (A leftjoin B on (Pab)) leftjoin C on (Pac)
= (A leftjoin C on (Pac)) leftjoin B on (Pab)

3.  (A leftjoin B on (Pab)) leftjoin C on (Pbc)
= A leftjoin (B leftjoin C on (Pbc)) on (Pab)

Identity 3 only holds if predicate Pbc must fail for all-null B rows
(that is, Pbc is strict for at least one column of B).  If Pbc is not
strict, the first form might produce some rows with nonnull C columns
where the second form would make those entries null.
===

In other words, I think your statement that null is never equal to
null is a bit imprecise.  Somebody could certainly create an operator
that is named "=" which returns true in that case, and then they could
say, hey, two nulls are equal (when you use that operator).  The
argument needs to be made in terms of the formal properties of the
operator.  The relevant logic is in have_partkey_equi_join:

+   /* Skip clauses which are not equality conditions. */
+   if (rinfo->hashjoinoperator == InvalidOid &&
+   !rinfo->mergeopfamilies)
+   continue;

Actually, I think the hashjoinoperator test is formally and
practically unnecessary here; lower down there is a test to see
whether the partitioning scheme's operator family is a member of
rinfo->mergeopfamilies, which will certainly fail if we got through
this test with rinfo->mergeopfamilies == NIL just on the strength of
rinfo->hashjoinoperator != InvalidOid.  So you can just bail out if
rinfo->mergeopfamilies == NIL.  But the underlying point here is that
the only thing you really know about the function is that it's got to
be a strategy-3 operator in some btree opclass; if that guarantees
strictness, then so be it -- but I wasn't able to find anything in the
code or documentation 

Re: [HACKERS] [COMMITTERS] pgsql: Collect and use multi-column dependency stats

2017-04-05 Thread David Rowley
On 6 April 2017 at 11:33, Tom Lane  wrote:
> David Rowley  writes:
>> On 6 April 2017 at 10:48, Tom Lane  wrote:
>>> The buildfarm is unhappy about the fact that this changed the API
>>> for clauselist_selectivity().  I am not convinced that that change
>>> was a good idea, so before telling FDW authors that they need to
>>> change their code, I'd like to hear a defense of the API change.
>
>> Because varRelid is often passed in as 0, and that meant we'd have to
>> write some code to check whether the clause was made up of RestrictInfos
>> from a single relation or not, and look for extended stats on that
>> singleton rel.
>
> Generally, if it's passed as zero, that's a good clue that the clause
> *is* a join clause.  In any case, this defense fails to address my
> other question, which is what's going to happen to this API when you
> want to use extended stats in join-clause estimates, which I'd expect
> to surely happen before very long.
>
> Also, I find it hard to believe that a bms_get_singleton_member call is
> going to be material in comparison to all the work that will be invoked
> indirectly via whatever selectivity estimation function gets called for
> each clause.  Even a single catcache fetch would swamp that.
>
> So no, you have not convinced me that this isn't a broken design.
>
>> FWIW, I found this function being called 72 times in a 5 way join
>> search problem.
>
> And you measured the overhead of doing it the other way to be ... ?
> Premature optimization and all that.

I tested with the attached, and it does not seem to hurt planner
performance executing:

explain select * from ab ab1
inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b
inner join ab ab3 on ab1.a = ab3.a and ab1.b = ab3.b
inner join ab ab4 on ab1.a = ab4.a and ab1.b = ab4.b
inner join ab ab5 on ab1.a = ab5.a and ab1.b = ab5.b
inner join ab ab6 on ab1.a = ab6.a and ab1.b = ab6.b
inner join ab ab7 on ab1.a = ab7.a and ab1.b = ab7.b
inner join ab ab8 on ab1.a = ab8.a and ab1.b = ab8.b;

after having executed:

create table ab (a int, b int);

I get:

find_relation_from_clauses
tps = 48.992918 (excluding connections establishing)
tps = 49.060407 (excluding connections establishing)
tps = 49.075815 (excluding connections establishing)

Master

tps = 48.938027 (excluding connections establishing)
tps = 48.066274 (excluding connections establishing)
tps = 48.727089 (excluding connections establishing)

running pgbench -n -T 60 -f 8wayjoin.sql


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


find_relation_from_clauses.patch
Description: Binary data



Re: [HACKERS] Re: new set of psql patches for loading (saving) data from (to) text, binary files

2017-04-05 Thread Andres Freund
On 2017-03-18 17:51:48 +0100, Pavel Stehule wrote:
> What is done:
> 
> create table foo(a bytea);
> 
> -- import
> insert into foo values($1)
> \gloadfrom ~/xxx.jpg bytea
> 
> -- export
> \pset format binary
> select a from foo
> \g ~/xxx2.jpg
> 
> tested on import 55MB binary file
> 
> Comments, notes?

I don't like the API here much.  Loading requires knowledge of some
magic $1 value and allows only a single column, and printing doesn't mean
much when there's multiple columns / rows.

I think the loading side of things should be redesigned into a more
general facility for providing query parameters.  E.g. something like
\setparam $1 'whateva'
\setparamfromfile $2 'somefile'
\setparamfromprogram $3 cat /frakbar

which then would get used in the next query sent to the server.  That'd
allow importing multiple columns, and it'd be useful for other purposes
than just loading binary data.


I don't yet have a good idea how to deal with moving individual cells
into files, so they can be loaded.  One approach would be to have
something like

\storequeryresult filename_template.%row.%column

which'd then print the current query buffer into the relevant file after
doing replacement on %row and %column.

I don't think we can find an API we agree upon in the next 48h...

- Andres




Re: [HACKERS] pgbench - allow to store select results into variables

2017-04-05 Thread Tatsuo Ishii
> Well, personally, as an all-ASCII guy I'm not too fussed about that,
> but I can see that other people might be annoyed.
> 
> The main problem in dealing with it seems to be whether you're willing
> to support pgbench running in non-backend-safe encodings (eg SJIS).
> If we rejected that case then it'd be a relatively simple change to allow
> pgbench to treat any high-bit-set byte as a valid variable name character.
> (I think anyway, haven't checked the code.)
> 
> Although ... actually, psql allows any high-bit-set byte in variable
> names (cf valid_variable_name()) without concern about encoding.
> That means it's formally incorrect in SJIS, but it's been like that
> for an awful lot of years and nobody's complained.  Maybe it'd be fine
> for pgbench to act the same.

That's my thought too.

> Having said all that, I think we're at the point in the commitfest
> where if there's any design question at all about a patch, it should
> get booted to the next cycle.  We are going to have more than enough
> to do to stabilize what's already committed, we don't need to be
> adding more uncertainty.

Ok, I will move the patch to the next cf.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] pgbench - allow to store select results into variables

2017-04-05 Thread Andres Freund
On 2017-04-05 20:24:19 -0400, Tom Lane wrote:
> Having said all that, I think we're at the point in the commitfest
> where if there's any design question at all about a patch, it should
> get booted to the next cycle.  We are going to have more than enough
> to do to stabilize what's already committed, we don't need to be
> adding more uncertainty.

+1




Re: [HACKERS] pgbench - allow to store select results into variables

2017-04-05 Thread Tom Lane
Tatsuo Ishii  writes:
> I still wonder whether I should commit this or not because this patch
> does not allow non-ASCII column names.

Well, personally, as an all-ASCII guy I'm not too fussed about that,
but I can see that other people might be annoyed.

The main problem in dealing with it seems to be whether you're willing
to support pgbench running in non-backend-safe encodings (eg SJIS).
If we rejected that case then it'd be a relatively simple change to allow
pgbench to treat any high-bit-set byte as a valid variable name character.
(I think anyway, haven't checked the code.)

Although ... actually, psql allows any high-bit-set byte in variable
names (cf valid_variable_name()) without concern about encoding.
That means it's formally incorrect in SJIS, but it's been like that
for an awful lot of years and nobody's complained.  Maybe it'd be fine
for pgbench to act the same.
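
For illustration, the relaxed test could look roughly like this -- loosely
modeled on psql's behavior and not checked against the actual pgbench lexer:

#include <stdbool.h>

/*
 * Sketch only: accept ASCII letters, digits, underscore, and any byte with
 * the high bit set as a variable-name character, without inspecting the
 * encoding at all.
 */
static bool
is_variable_name_char(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '_' ||
           (c & 0x80) != 0;    /* any high-bit-set byte */
}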

Having said all that, I think we're at the point in the commitfest
where if there's any design question at all about a patch, it should
get booted to the next cycle.  We are going to have more than enough
to do to stabilize what's already committed, we don't need to be
adding more uncertainty.

regards, tom lane




Re: [HACKERS] strange parallel query behavior after OOM crashes

2017-04-05 Thread Neha Khatri
On Wed, Apr 5, 2017 at 5:34 PM, Kuntal Ghosh 
 wrote:

> On Tue, Apr 4, 2017 at 12:16 PM, Neha Khatri 
> wrote:
>
> > I feel there should be an assert if
> > (BackgroundWorkerData->parallel_register_count -
> > BackgroundWorkerData->parallel_terminate_count) > max_parallel_workers)
> >
> Backend 1 > SET max_parallel_workers = 8;
> Backend 1 > Execute a long-running parallel query q1 with the number of
> parallel workers spawned being, say, 4.
> Backend 2 > SET max_parallel_workers = 3;
> Backend 2 > Execute a long-running parallel query q2 with the number of
> parallel workers spawned > 0.
>
> The above assert statement will bring down the server unnecessarily
> while executing q2.


Right, with multiple backends trying to fiddle with max_parallel_workers,
that might bring the server down with the said assert:
Assert(parallel_register_count - parallel_terminate_count <=
max_parallel_workers)

The problem here seems to be the change in the max_parallel_workers value while
the parallel workers are still under execution. So this poses two questions:

1. From a use-case point of view, why could there be a need to tweak
max_parallel_workers exactly at the time when the parallel workers are at
play?
2. Could there be a restriction on tweaking max_parallel_workers while
the parallel workers are at play? At least do not allow setting
max_parallel_workers to less than the current # of active parallel workers.

Regards,
Neha


Re: [HACKERS] pgbench - allow to store select results into variables

2017-04-05 Thread Tatsuo Ishii
Tom and others,

I still wonder whether I should commit this or not because this patch
does not allow non-ASCII column names. We know pgbench variable names
have been restricted since the functionality was born. When users
explicitly define a pgbench variable using \set, that is not too
strong a limitation, because it's in the "pgbench world" anyway and
nothing there is related to the PostgreSQL core specs. However, \gset is not
happy with perfectly valid column names in PostgreSQL core, which
looks too inconsistent and confusing for users.

So the choices are:

1) commit the patch now, documenting the limitation.
   (the patch looks good to me except the issue above)

2) move it to the next CF, hoping that someone starts an implementation to
eliminate the limitation on non-ASCII variable names.

Comments?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Hello Tatsuo-san,
> 
>> It seems the new feature \gset doesn't work with tables having
>> non-ASCII column names:
> 
> Indeed. The same error is triggered with the \set syntax, which does
> not involve any query execution.
> 
> I have added a sentence mentioning the restriction when variables are
> first discussed in the documentation, see attached patch.
> 
> -- 
> Fabien.




Re: [HACKERS] Fast Default WIP patch for discussion

2017-04-05 Thread Andres Freund
Hi Serge,

On 2016-10-28 08:28:11 -0700, Serge Rielau wrote:
> Time for me to dig into that then.

Are you planning to update your POC at some point? This'd be a very
welcome improvement.

Regards,

Andres




Re: [HACKERS] pgbench more operators & functions

2017-04-05 Thread Andres Freund
Hi,

On 2017-03-16 12:21:31 -0400, David Steele wrote:
> Any reviewers want to have a look?

Unfortunately there wasn't much of that.  I think that this patch
atm doesn't have sufficient design agreement to be considered for
v10.  As the current CF has formally ended, I think we'll have to punt
this to v11.

Greetings,

Andres Freund




Re: [HACKERS] [COMMITTERS] pgsql: Collect and use multi-column dependency stats

2017-04-05 Thread Tom Lane
David Rowley  writes:
> On 6 April 2017 at 10:48, Tom Lane  wrote:
>> The buildfarm is unhappy about the fact that this changed the API
>> for clauselist_selectivity().  I am not convinced that that change
>> was a good idea, so before telling FDW authors that they need to
>> change their code, I'd like to hear a defense of the API change.

> Because varRelid is often passed in as 0, and that meant we'd have to
> write some code to check whether the clause was made up of RestrictInfos
> from a single relation or not, and look for extended stats on that
> singleton rel.

Generally, if it's passed as zero, that's a good clue that the clause
*is* a join clause.  In any case, this defense fails to address my
other question, which is what's going to happen to this API when you
want to use extended stats in join-clause estimates, which I'd expect
to surely happen before very long.

Also, I find it hard to believe that a bms_get_singleton_member call is
going to be material in comparison to all the work that will be invoked
indirectly via whatever selectivity estimation function gets called for
each clause.  Even a single catcache fetch would swamp that.

So no, you have not convinced me that this isn't a broken design.

> FWIW, I found this function being called 72 times in a 5 way join
> search problem.

And you measured the overhead of doing it the other way to be ... ?
Premature optimization and all that.

regards, tom lane




Re: [HACKERS] LWLock optimization for multicore Power machines

2017-04-05 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-03 11:56:13 -0700, Andres Freund wrote:
>> Have you done x86 benchmarking?

> I think unless such benchmarking is done in the next 24h we need to move
> this patch to the next CF...

In theory, inlining the _impl function should lead to exactly the same
assembly code as before, on all non-PPC platforms.  Confirming that would
be a good substitute for benchmarking.

Having said that, I'm not sure this patch is solid enough to push in
right now, and would be happy with seeing it pushed to v11.

regards, tom lane




Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-04-05 Thread Andres Freund
Hi,

On 2017-03-30 14:11:28 +0200, Peter Moser wrote:
> 2017-03-01 10:56 GMT+01:00 Peter Moser :
> > A similar walkthrough for ALIGN will follow soon.
> >
> > We are thankful for any suggestions or ideas we can use to write
> > good SGML documentation.
> 
> The attached README explains the ALIGN operation step-by-step with a
> TEMPORAL LEFT OUTER JOIN example. That is, we start from a query
> input, show how we rewrite it during the parser stage, and show how the
> final execution generates result tuples.

Unfortunately I don't think this patch has received sufficient design
and implementation work to consider merging it into v10.  As code freeze is
in two days, I think we'll have to move this to the next commitfest.

- Andres




Re: [HACKERS] PATCH: recursive json_populate_record()

2017-04-05 Thread Andrew Dunstan


On 04/04/2017 05:18 PM, Andrew Dunstan wrote:
>
> On 04/03/2017 05:17 PM, Andres Freund wrote:
>> On 2017-03-21 14:31:08 -0400, Andrew Dunstan wrote:
>>> On 03/21/2017 01:37 PM, David Steele wrote:
 On 3/16/17 11:54 AM, David Steele wrote:
> On 2/1/17 12:53 AM, Michael Paquier wrote:
>> On Thu, Jan 26, 2017 at 6:49 AM, Tom Lane  wrote:
>>> Nikita Glukhov  writes:
 On 25.01.2017 23:58, Tom Lane wrote:
> I think you need to take a second look at the code you're producing
> and realize that it's not so clean either.  This extract from
> populate_record_field, for example, is pretty horrid:
 But what if we introduce some helper macros like this:
 #define JsValueIsNull(jsv) \
  ((jsv)->is_json ? !(jsv)->val.json.str \
  : !(jsv)->val.jsonb || (jsv)->val.jsonb->type == jbvNull)
 #define JsValueIsString(jsv) \
  ((jsv)->is_json ? (jsv)->val.json.type == JSON_TOKEN_STRING \
  : (jsv)->val.jsonb && (jsv)->val.jsonb->type == jbvString)
>>> Yeah, I was wondering about that too.  I'm not sure that you can make
>>> a reasonable set of helper macros that will fix this, but if you want
>>> to try, go for it.
>>>
>>> BTW, just as a stylistic thing, I find "a?b:c||d" unreadable: I have
>>> to go back to the manual every darn time to convince myself whether
>>> that means (a?b:c)||d or a?b:(c||d).  It's better not to rely on
>>> the reader (... or the author) having memorized C's precedence rules
>>> in quite that much detail.  Extra parens help.
>> Moved to CF 2017-03 as discussion is going on and more review is
>> needed on the last set of patches.
>>
> The current patches do not apply cleanly at cccbdde:
>
> $ git apply ../other/0001-introduce-JsonContainerXxx-macros-v04.patch
> error: patch failed: src/backend/utils/adt/jsonb_util.c:328
> error: src/backend/utils/adt/jsonb_util.c: patch does not apply
> error: patch failed: src/backend/utils/adt/jsonfuncs.c:1266
> error: src/backend/utils/adt/jsonfuncs.c: patch does not apply
> error: patch failed: src/include/utils/jsonb.h:218
> error: src/include/utils/jsonb.h: patch does not apply
>
> In addition, it appears a number of suggestions have been made by Tom
> that warrant new patches.  Marked "Waiting on Author".
 This thread has been idle for months since Tom's review.

 The submission has been marked "Returned with Feedback".  Please feel
 free to resubmit to a future commitfest.


>>> Please revive. I am planning to look at this later this week.
>> Since that was 10 days ago - you're still planning to get this in before
>> the freeze?
>>
>
> Hoping to, other demands have intervened a bit, but I might be able to.
>



OK, I have been through this and I think it's basically committable. I
am currently doing some cleanup, such as putting all the typedefs in one
place, fixing redundant comments and adding more comments, declaring all
the static functions, and minor code changes, but nothing major. I
should be ready to commit tomorrow some time.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] [COMMITTERS] pgsql: Collect and use multi-column dependency stats

2017-04-05 Thread David Rowley
On 6 April 2017 at 10:48, Tom Lane  wrote:
> Simon Riggs  writes:
>> Collect and use multi-column dependency stats
>
> The buildfarm is unhappy about the fact that this changed the API
> for clauselist_selectivity().  I am not convinced that that change
> was a good idea, so before telling FDW authors that they need to
> change their code, I'd like to hear a defense of the API change.
> Why not just use the existing varRelid parameter for that?  Why
> is there an assumption that only one rel's extended stats will
> ever be of interest?  This function does get used for join clauses.

Because varRelid is often passed in as 0, and that meant we'd have to
write some code to check whether the clause was made up of RestrictInfos
from a single relation or not, and look for extended stats on that
singleton rel. Given the number of times this function gets called
during planning, especially so with queries involving many joins, I
think it's best to not have to extract the correct relids each time. I
coded it that way to reduce the overhead during planning, which is
something that often comes up with planner patches.

FWIW, I found this function being called 72 times in a 5 way join
search problem.

Perhaps there is a nicer way to do this, I just couldn't think of it.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] LWLock optimization for multicore Power machines

2017-04-05 Thread Andres Freund
Hi,

On 2017-04-03 11:56:13 -0700, Andres Freund wrote:
> 
> > +/*
> > + * Generic implementation of pg_atomic_fetch_mask_add_u32() via loop
> > + * of compare & exchange.
> > + */
> > +static inline uint32
> > +pg_atomic_fetch_mask_add_u32_impl(volatile pg_atomic_uint32 *ptr,
> > + uint32 mask_, 
> > uint32 add_)
> > +{
> > +   uint32  old_value;
> > +
> > +   /*
> > +* Read once outside the loop, later iterations will get the newer value
> > +* via compare & exchange.
> > +*/
> > +   old_value = pg_atomic_read_u32_impl(ptr);
> > +
> > +   /* loop until we've determined whether we could make an increment or 
> > not */
> > +   while (true)
> > +   {
> > +   uint32  desired_value;
> > +   boolfree;
> > +
> > +   desired_value = old_value;
> > +   free = (old_value & mask_) == 0;
> > +   if (free)
> > +   desired_value += add_;
> > +
> > +   /*
> > +* Attempt to swap in the value we are expecting. If we didn't 
> > see
> > +* masked bits to be clear, that's just the old value. If we 
> > saw them
> > +* as clear, we'll attempt to make an increment. The reason 
> > that we
> > +* always swap in the value is that this doubles as a memory 
> > barrier.
> > +* We could try to be smarter and only swap in values if we saw 
> > the
> > +* masked bits as clear, but benchmarks haven't shown it as
> > beneficial
> > +* so far.
> > +*
> > +* Retry if the value changed since we last looked at it.
> > +*/
> > +   if (pg_atomic_compare_exchange_u32_impl(ptr, &old_value, 
> > desired_value))
> > +   return old_value;
> > +   }
> > +   pg_unreachable();
> > +}
> > +#endif
> 
> Have you done x86 benchmarking?

I think unless such benchmarking is done in the next 24h we need to move
this patch to the next CF...

Greetings,

Andres Freund




Re: [HACKERS] [GENERAL] C++ port of Postgres

2017-04-05 Thread Andres Freund
Hi Peter,

On 2017-02-28 22:30:16 -0800, Andres Freund wrote:
> On 2017-02-28 23:42:45 -0500, Peter Eisentraut wrote:
> > On 1/26/17 22:46, Andres Freund wrote:
> > > On 2016-09-30 15:24:09 -0400, Peter Eisentraut wrote:
> > >> Yeah, I have committed a few of the patches now and I'll close the CF
> > >> entry now.  Thanks for your research.
> > > 
> > > Are you planning to push more of these at some point?
> > 
> > Sure, let's take a look.
> [  partial review ]

Are you planning to get any of this into v10?  There's one or two
patches here that could qualify as bugfixes (the GinTernary stuff), but
the rest doesn't strike me as something that should be committed at the
tail end of the merge window. 

Regards,

Andres




Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-04-05 Thread Andres Freund
Hi,

On 2017-03-11 13:06:13 +0100, Pavel Stehule wrote:
> 2017-03-10 15:45 GMT+01:00 Alexander Korotkov :
> 
> > On Fri, Mar 10, 2017 at 5:10 PM, Peter Eisentraut <
> > peter.eisentr...@2ndquadrant.com> wrote:
> >
> >> On 2/24/17 16:32, Pavel Stehule wrote:
> >> > set EXTENDED_DESCRIBE_SORT size_desc
> >> > \dt+
> >> > \l+
> >> > \di+
> >> >
> >> > Possible variants: schema_table, table_schema, size_desc, size_asc
> >>
> >> I can see this being useful, but I think it needs to be organized a
> >> little better.
> >>
> >> Sort key and sort direction should be separate settings.
> >>
> >
> > I agree.
> >
> > I'm not sure why we need to have separate settings to sort by schema
> >> name and table name.
> >
> >
> > I think sorting by schema name, then object name makes sense for people who
> > have objects of the same name in different schemas.
> >
> 
> I am sending an updated version with the sort direction separated into a
> special variable.
> 
> There is a question: does a descending direction make sense for columns
> like schema or table name?
> 
> Using desc/asc for size is natural, but for table names?

I think it's pretty clear that we don't have sufficient agreement on the
design, not to speak of an implementation of an agreed-upon design, to
get this into v10.  The patch has also been submitted late in the v10
cycle, and has received attention.  I'm therefore moving it to the next
commitfest.

Regards,

Andres




Re: [HACKERS] scram and \password

2017-04-05 Thread Michael Paquier
On Thu, Apr 6, 2017 at 2:11 AM, Heikki Linnakangas  wrote:
> At a quick glance, moving pg_frontend_random() to src/common looks like a
> non-starter. It uses pglock_thread() which is internal to libpq, so it won't
> compile as it is. I think I'm going to change scram_build_verifier() to take
> a pre-generated salt as argument, to avoid the need for a random number
> generator in src/common.

Oops. Need an updated set of patches?
-- 
Michael




Re: [HACKERS] Re: new set of psql patches for loading (saving) data from (to) text, binary files

2017-04-05 Thread Andres Freund
Hi,

On 2017-03-18 17:51:48 +0100, Pavel Stehule wrote:
> What is done:
> 
> create table foo(a bytea);
> 
> -- import
> insert into foo values($1)
> \gloadfrom ~/xxx.jpg bytea
> 
> -- export
> \pset format binary
> select a from foo
> \g ~/xxx2.jpg
> 
> tested on import 55MB binary file
> 
> Comments, notes?
> 
> Available import formats are limited to text, bytea, xml - these formats
> are safe for receiving data via recv function.

I don't think we have design agreement on this at this point.  Given the
upcoming code freeze, I think we'll have to hash this out during the
next development cycle.  Any counterarguments?

- Andres




[HACKERS] Bug with pg_basebackup and 'shared' tablespace

2017-04-05 Thread Pierre Ducroquet
Hi

On our servers, we are running different PostgreSQL versions because we cannot 
migrate every application to Pg 9.6 at the same time…
Since we have several disks, we use tablespaces and, for historical reasons, 
we used the same folder for both Pg versions, say /mnt/ssd/postgres.
The server has absolutely no issue with that; there is not a single warning 
when running CREATE TABLESPACE.
But it all gets messy when we want to create a streaming standby server using 
pg_basebackup. When backing up Pg 9.5 there is no issue, but backing up Pg 9.6 
afterwards fails with "directory "/mnt/ssd/postgres" exists but is not empty".
The script attached to this mail easily reproduces the issue in case my 
description is unclear. Just... read it before launching it; it's a quick and 
dirty reproduction script. :)

I have looked a bit at the pg_basebackup code and the replication protocol, 
and I did not find any simple way to solve the issue. PostgreSQL uses 
CATALOG_VERSION_NO to keep one folder per Pg version inside a tablespace, but 
that constant is not accessible outside the PostgreSQL code and is not exposed 
in the replication protocol described in the documentation: 
https://www.postgresql.org/docs/current/static/protocol-replication.html
The only easy way I see to fix the issue is to alter the replication protocol 
to expose the CATALOG_VERSION_NO constant, making it possible for 
pg_basebackup to know the proper path to check. Another way would be to change 
the pg_basebackup logic to back up the main data folder first, in order to be 
able to read the pg_control file… but this seems ugly too.

I am willing to write the patch for this issue, but I would appreciate some 
help to find a proper way to fix it.

Thanks

 Pierre

reproduce-bug-tblspace.sh
Description: application/shellscript




Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki

2017-04-05 Thread Andres Freund
Hi,

On 2017-04-02 22:28:40 +0200, Jan Michálek wrote:
> 2017-03-23 17:26 GMT+01:00 Pierre Ducroquet :
> > The new status of this patch is: Waiting on Author
> >
> 
> Corrected a problem with \pset linestyle when the format is set to markdown or
> rst.
> 
> Corrected tuples-only output for markdown and rst (normal and expanded).

This patch is still undergoing development and review, and the code
freeze is closing in rapidly.  As this patch was submitted just
before the last commitfest in the v10 cycle and has received plenty of
feedback, I think it's fair to move it to the next commitfest.

Regards,

Andres




Re: [HACKERS] GUC for cleanup indexes threshold.

2017-04-05 Thread Andres Freund
Hi,

On 2017-04-01 03:05:07 +0900, Masahiko Sawada wrote:
> On Fri, Mar 31, 2017 at 11:44 PM, Robert Haas  wrote:
> [ lots of valuable discussion ]

I think this patch is clearly still in the design stage, and has
received plenty of feedback this CF.  I'll therefore move it to the next
commitfest.

- Andres




Re: [HACKERS] [COMMITTERS] pgsql: Collect and use multi-column dependency stats

2017-04-05 Thread Tom Lane
Simon Riggs  writes:
> Collect and use multi-column dependency stats

The buildfarm is unhappy about the fact that this changed the API
for clauselist_selectivity().  I am not convinced that that change
was a good idea, so before telling FDW authors that they need to
change their code, I'd like to hear a defense of the API change.
Why not just use the existing varRelid parameter for that?  Why
is there an assumption that only one rel's extended stats will
ever be of interest?  This function does get used for join clauses.

regards, tom lane




Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test

2017-04-05 Thread Michael Paquier
On Wed, Apr 5, 2017 at 10:24 PM, Stephen Frost  wrote:
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> 1) Initialize the old cluster and start it.
>> 2) Create a bunch of databases with the full range of ASCII characters.
>> 3) Run regression tests.
>> 4) Take a dump of the old cluster.
>> 5) Stop the old cluster.
>> 6) Initialize the new cluster.
>> 7) Run pg_upgrade.
>> 8) Start the new cluster.
>> 9) Take a dump from it.
>> 10) Run the deletion script (Oops, forgot this part!)
>
> Presumably the check to match the old dump against the new one is also
> performed?

Yes. That's run with command_ok() at the end.
-- 
Michael




Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-04-05 Thread Andres Freund
Hi,

On 2017-04-05 17:00:42 +1000, Vaishnavi Prabakaran wrote:
> Regarding test patch, I have corrected the test suite after David Steele's
> comments.
> Also, I would like to mention that a companion patch was submitted by David
> Steele up-thread.
> 
> Attached the latest code and test patch.

My impression is that this'll need a couple more rounds of review. Given
that this'll establish API we're pretty much never going to be able to
change or remove, I think it'd be a bad idea to rush this into v10.
Therefore I propose moving this to the next CF.

- Andres




Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-04-05 Thread Andres Freund
On 2017-03-13 18:45:00 +0530, Mithun Cy wrote:
> I have implemented similar logic now. The prewarm bgworker will
> launch a sub-worker per database in the dump file, and each
> sub-worker will load its database's block info. Each sub-worker is
> launched only after the previous one has finished. All of this will only
> start once the database has reached a consistent state.

Hm. For replay performance it'd possibly be good to start earlier,
before reaching consistency.  Is there an issue starting earlier?


> diff --git a/contrib/pg_prewarm/autoprewarm.c 
> b/contrib/pg_prewarm/autoprewarm.c
> new file mode 100644
> index 000..f4b34ca
> --- /dev/null
> +++ b/contrib/pg_prewarm/autoprewarm.c
> @@ -0,0 +1,1137 @@
> +/*-
> + *
> + * autoprewarm.c
> + *
> + * -- Automatically prewarm the shared buffer pool when server restarts.

Don't think we usually use -- here.


> + *   Copyright (c) 2013-2017, PostgreSQL Global Development Group

Hm, that's a bit of a weird date range.


> + *   IDENTIFICATION
> + *   contrib/pg_prewarm.c/autoprewarm.c
> + *-
> + */

The pg_prewarm.c in there looks like some search & replace gone awry.



> +#include "postgres.h"
> +#include 
> +
> +/* These are always necessary for a bgworker. */
> +#include "miscadmin.h"
> +#include "postmaster/bgworker.h"
> +#include "storage/ipc.h"
> +#include "storage/latch.h"
> +#include "storage/lwlock.h"
> +#include "storage/proc.h"
> +#include "storage/shmem.h"
> +
> +/* These are necessary for prewarm utilities. */
> +#include "pgstat.h"
> +#include "storage/buf_internals.h"
> +#include "storage/smgr.h"
> +#include "utils/memutils.h"
> +#include "utils/resowner.h"
> +#include "utils/guc.h"
> +#include "catalog/pg_class.h"
> +#include "catalog/pg_type.h"
> +#include "executor/spi.h"
> +#include "access/xact.h"
> +#include "utils/rel.h"
> +#include "port/atomics.h"

I'd rather just sort these alphabetically.




I think this should rather be in the initial header.

> +/*
> + * autoprewarm :
> + *
> + * What is it?
> + * ===
> + * A bgworker which automatically records information about blocks which were
> + * present in buffer pool before server shutdown and then prewarm the buffer
> + * pool upon server restart with those blocks.
> + *
> + * How does it work?
> + * =
> + * When the shared library "pg_prewarm" is preloaded, a
> + * bgworker "autoprewarm" is launched immediately after the server has 
> reached
> + * consistent state. The bgworker will start loading blocks recorded in the
> + * format BlockInfoRecord
> + * <> in
> + * $PGDATA/AUTOPREWARM_FILE, until there is a free buffer left in the buffer
> + * pool. This way we do not replace any new blocks which were loaded either 
> by
> + * the recovery process or the querying clients.

s/until there is a/until there is no/?
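
(For orientation: assuming the worker is registered whenever the library is
preloaded, as the comment above describes, enabling the feature is just a
configuration step, e.g.

ALTER SYSTEM SET shared_preload_libraries = 'pg_prewarm';
-- takes effect at the next restart; the bgworker then reloads the blocks
-- recorded in $PGDATA/AUTOPREWARM_FILE

followed by a server restart.)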


> +/*
> + * 
> 
> + * ===SIGNAL HANDLERS
> ===
> + * 
> 
> + */

Hm...

> +static void sigtermHandler(SIGNAL_ARGS);
> +static void sighupHandler(SIGNAL_ARGS);

I don't think that's a casing we commonly use.  We mostly use CamelCase
or underscore_case.


> +/*
> + *   Signal handler for SIGUSR1.
> + */
> +static void
> +sigusr1Handler(SIGNAL_ARGS)
> +{
> + int save_errno = errno;
> +
> + if (MyProc)
> + SetLatch(&MyProc->procLatch);
> +
> + errno = save_errno;
> +}

Hm, what's this one for?


> +/*
> + * Shared state information about the running autoprewarm bgworker.
> + */
> +typedef struct AutoPrewarmSharedState
> +{
> + pg_atomic_uint32 current_task;  /* current tasks performed by
> + 
>  * autoprewarm workers. */
> +} AutoPrewarmSharedState;

Hm.  Why do we need atomics here?  I thought there's no concurrency?


> +/*
> + * sort_cmp_func - compare function used for qsort().
> + */
> +static int
> +sort_cmp_func(const void *p, const void *q)
> +{

rename to blockinfo_cmp?



> +static AutoPrewarmTask
> +get_autoprewarm_task(AutoPrewarmTask todo_task)
> +{
> + boolfound;
> +
> + state = NULL;
> +
> + LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
> + state = ShmemInitStruct("autoprewarm",
> + 
> sizeof(AutoPrewarmSharedState),
> + );
> + if (!found)
> + pg_atomic_write_u32(&(state->current_task), todo_task);

Superflous parens (repeated a lot).


> + LWLockRelease(AddinShmemInitLock);
> +
> + /* If found check if we can go ahead. */
> + if (found)

Re: [HACKERS] multivariate statistics (v25)

2017-04-05 Thread David Rowley
On 6 April 2017 at 10:17, Simon Riggs  wrote:
> On 5 April 2017 at 10:47, David Rowley  wrote:
>
>> I've attached an updated patch to address Tomas' concerns and yours too.
>
> Committed, with some doc changes and additions based upon my explorations.

Great. Thanks for committing!


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] multivariate statistics (v25)

2017-04-05 Thread Simon Riggs
On 5 April 2017 at 10:47, David Rowley  wrote:

> I've attached an updated patch to address Tomas' concerns and yours too.

Committed, with some doc changes and additions based upon my explorations.

For the record, I measured the time to calculate extended statistics as
+800ms on a 2 million row sample.
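
For anyone wanting to try it, a minimal sketch (table and column names are made
up; syntax as quoted elsewhere in this thread):

CREATE TABLE addresses (city text, zip text);  -- two correlated columns
CREATE STATISTICS addr_dep WITH (dependencies) ON (city, zip) FROM addresses;
ANALYZE addresses;  -- this is where the extra cost measured above is paid
-- the planner can then use the dependency when both columns are restricted:
EXPLAIN SELECT * FROM addresses WHERE city = 'Prague' AND zip = '11000';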

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] multivariate statistics (v25)

2017-04-05 Thread David Rowley
On 6 April 2017 at 07:19, Tels  wrote:
> I know I'm a bit late, but isn't the syntax backwards?
>
> "CREATE STATISTICS s1 WITH (dependencies) ON (col_a, col_b) FROM table;"
>
> These do it the other way round:
>
> CREATE INDEX idx ON table (col_a);
>
> AND:
>
>CREATE TABLE t (
>  id INT  REFERENCES table_2 (col_b);
>);
>
> Won't this be confusing and make things hard to remember?
>
> Sorry for not asking earlier, I somehow missed this.

The reasoning is in [1]

[1] 
https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=vbj78v...@mail.gmail.com


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-04-05 Thread Andres Freund
On 2017-04-05 17:22:34 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I'd like some input from other committers whether we want this.  I'm
> > somewhat doubtful, but don't have particularly strong feelings.
> 
> I don't really want to expose the workings of the plancache at user level.
> The heuristics it uses certainly need work, but it'll get hard to change
> those once there are SQL features depending on it.
> 
> Also, as you note, there are debatable design decisions in this particular
> patch.  There are already a couple of ways in which control knobs can be
> attached to plpgsql functions (i.e. custom GUCs and the comp_option stuff),
> so why is this patch wanting to invent yet another fundamental mechanism?
> And I'm not very happy about it imposing a new reserved keyword, either.
> 
> A bigger-picture question is why we'd only provide such functionality
> in plpgsql, and not for other uses of prepared plans.
> 
> Lastly, it doesn't look to me like the test cases prove anything at all
> about whether the feature does what it's claimed to.

That echoes my perception - so let's move this to the next CF?  It's not
like this patch has been pending for very long.

- Andres




Re: [HACKERS] partitioned tables and contrib/sepgsql

2017-04-05 Thread Mike Palmiotto
On Wed, Apr 5, 2017 at 1:22 PM, Mike Palmiotto
 wrote:
> On Wed, Apr 5, 2017 at 1:19 PM, Tom Lane  wrote:
>> Peter Eisentraut  writes:
>>> On 4/5/17 12:04, Tom Lane wrote:
 Conclusion: Fedora's gcc is playing fast and loose somehow with the
 command "#include "; that does not include the file
 you'd think it does, it does something magic inside the compiler.
 The magic evidently includes not complaining about duplicate macro
 definitions for true and false.
>>
>>> Perhaps -Wsystem-headers would change the outcome in your case.
>>
>> Hah, you're right: with that,
>>
>> In file included from /usr/include/selinux/label.h:9:0,
>>  from label.c:40:
>> /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include/stdbool.h:34:0: warning: 
>> "true" redefined
>>  #define true 1
>>
>> In file included from ../../src/include/postgres.h:47:0,
>>  from label.c:11:
>> ../../src/include/c.h:206:0: note: this is the location of the previous 
>> definition
>>  #define true ((bool) 1)
>>
>> and likewise for "false".  So that mystery is explained.
>>
>> I stand by my previous patch suggestion, except that we can replace
>> the parenthetical comment with something like "(We don't care if
>>  redefines "true"/"false"; those are close enough.)".
>>
>
> Sounds good. Updated patchset will include that verbiage, along with
> some regression tests for partitioned tables.

Looks like the label regression test is failing even without these changes.

Weirdly enough, this only happens in one place:

 84 SELECT objtype, objname, label FROM pg_seclabels
 85 WHERE provider = 'selinux' AND objtype = 'column' AND (objname
like 't3.%' OR objname like 't4.%');

I haven't figured out why this may be (yet), but it seems like we
shouldn't assume the order of output from a SELECT. As I understand
it, the order of the output is subject to change with changes to the
planner/configuration.

I'm going to hold the partition table regression changes for a
separate patch and include some ORDER BY fixes. Will post tomorrow.
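
Roughly, the fix is just to make the expected output deterministic, e.g.:

SELECT objtype, objname, label FROM pg_seclabels
WHERE provider = 'selinux' AND objtype = 'column'
  AND (objname LIKE 't3.%' OR objname LIKE 't4.%')
ORDER BY objname;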

In the meantime, attached are the latest and greatest patches.

Thanks,
-- 
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com
From 50969159f0fd7c93a15bfb7b9046512399d0b099 Mon Sep 17 00:00:00 2001
From: Mike Palmiotto 
Date: Wed, 29 Mar 2017 14:59:37 +
Subject: [PATCH 3/3] Add partitioned table support to sepgsql

Account for RELKIND_PARTITIONED_TABLE in sepgsql and treat such objects like
regular relations. This allows for proper object_access hook behavior for
partitioned tables.
---
 contrib/sepgsql/label.c|  3 ++-
 contrib/sepgsql/relation.c | 30 +-
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c
index f7d79ab..1a1ec57 100644
--- a/contrib/sepgsql/label.c
+++ b/contrib/sepgsql/label.c
@@ -789,7 +789,8 @@ exec_object_restorecon(struct selabel_handle * sehnd, Oid catalogId)
 			case RelationRelationId:
 relForm = (Form_pg_class) GETSTRUCT(tuple);
 
-if (relForm->relkind == RELKIND_RELATION)
+if (relForm->relkind == RELKIND_RELATION ||
+	relForm->relkind == RELKIND_PARTITIONED_TABLE)
 	objtype = SELABEL_DB_TABLE;
 else if (relForm->relkind == RELKIND_SEQUENCE)
 	objtype = SELABEL_DB_SEQUENCE;
diff --git a/contrib/sepgsql/relation.c b/contrib/sepgsql/relation.c
index b794abe..0d8905c 100644
--- a/contrib/sepgsql/relation.c
+++ b/contrib/sepgsql/relation.c
@@ -54,12 +54,14 @@ sepgsql_attribute_post_create(Oid relOid, AttrNumber attnum)
 	ObjectAddress object;
 	Form_pg_attribute attForm;
 	StringInfoData audit_name;
+	char		relkind;
 
 	/*
 	 * Only attributes within regular relation have individual security
 	 * labels.
 	 */
-	if (get_rel_relkind(relOid) != RELKIND_RELATION)
+	relkind = get_rel_relkind(relOid);
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		return;
 
 	/*
@@ -135,8 +137,10 @@ sepgsql_attribute_drop(Oid relOid, AttrNumber attnum)
 {
 	ObjectAddress object;
 	char	   *audit_name;
+	char		relkind;
 
-	if (get_rel_relkind(relOid) != RELKIND_RELATION)
+	relkind = get_rel_relkind(relOid);
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		return;
 
 	/*
@@ -167,8 +171,11 @@ sepgsql_attribute_relabel(Oid relOid, AttrNumber attnum,
 {
 	ObjectAddress object;
 	char	   *audit_name;
+	char		relkind;
 
-	if (get_rel_relkind(relOid) != RELKIND_RELATION)
+	relkind = get_rel_relkind(relOid);
+
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("cannot set security label on non-regular columns")));
@@ -209,8 +216,11 @@ sepgsql_attribute_setattr(Oid relOid, AttrNumber attnum)
 {
 	ObjectAddress object;
 	char	   *audit_name;
+	char		relkind;
+
+	relkind = 

Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-04-05 Thread Tom Lane
Andres Freund  writes:
> I'd like some input from other committers whether we want this.  I'm
> somewhat doubtful, but don't have particularly strong feelings.

I don't really want to expose the workings of the plancache at user level.
The heuristics it uses certainly need work, but it'll get hard to change
those once there are SQL features depending on it.

Also, as you note, there are debatable design decisions in this particular
patch.  There are already a couple of ways in which control knobs can be
attached to plpgsql functions (i.e. custom GUCs and the comp_option stuff),
so why is this patch wanting to invent yet another fundamental mechanism?
And I'm not very happy about it imposing a new reserved keyword, either.

A bigger-picture question is why we'd only provide such functionality
in plpgsql, and not for other uses of prepared plans.

Lastly, it doesn't look to me like the test cases prove anything at all
about whether the feature does what it's claimed to.

regards, tom lane




Re: [HACKERS] possible encoding issues with libxml2 functions

2017-04-05 Thread Pavel Stehule
2017-03-17 4:23 GMT+01:00 Noah Misch :

> On Sun, Mar 12, 2017 at 10:26:33PM +0100, Pavel Stehule wrote:
> > 2017-03-12 21:57 GMT+01:00 Noah Misch :
> > > On Sun, Mar 12, 2017 at 08:36:58PM +0100, Pavel Stehule wrote:
> > > > 2017-03-12 0:56 GMT+01:00 Noah Misch :
> > > Please add a test case.
>

I am sending a test case.

I am afraid this cannot be part of the regression tests due to its strong
dependency on the win1250 locale.


> >
> > It needs a application - currently there is not possibility to import XML
> > document via recv API :(
>
> I think xml_in() can create every value that xml_recv() can create;
> xml_recv()
> is just more convenient given diverse source encodings.  If you make your
> application store the value into a table, does "pg_dump --inserts" emit
> code
> that reproduces the same value?  If so, you can use that in your test case.
> If not, please provide precise instructions (code, SQL commands) for
> reproducing the bug manually.
>

I was wrong - both methods produce a broken internal XML document.




[Sample data attached in its original win1250-encoded form; the XML markup was
stripped by the mail archive. It is an /enprimeur/vino record for the wine
"Alter Ego de Palmer" 2012 (Margaux), with a number of numeric fields, a long
Czech tasting note in its remark element, and ratings such as "TOPVINO SCORE:
92-94/100".]




The document is correctly converted from win1250 to UTF8 - it is perfectly
readable - but the header is now wrong: after the conversion the content is no
longer in win1250. This XML is attached (in its original form, win1250 encoding).

Import with

create table target(x xml);
\set xml `cat ~/Downloads/e6.xml`
set client_encoding to win1250;
insert into target values(:'xml');

then disconnect or reset the client encoding.

The document should be correctly inserted.

Then run a query

postgres=# select * from target, xpath('/enprimeur/vino', x);
ERROR:  could not parse XML document
DETAIL:  input conversion failed due to input error, bytes 0x88 0x73 0x6B
0x75
input conversion failed due to input error, bytes 0x88 0x73 0x6B 0x75
encoder error
line 25: Premature end of data in tag remark line 25
line 25: Premature end of data in tag vino line 3
line 25: Premature end of data in tag enprimeur line 2


select xmltable.* from target, xmltable('/enprimeur/vino' passing x columns
id int, nazev text, remark text);

XMLTABLE is not vulnerable to this bug and the result should be correct.

When you run

select x from target;

you see correct XML without the header, but

select x::text from target;

shows the wrong header.



>
> > > Why not use xml_parse() instead of calling xmlCtxtReadMemory()
> directly?
> > > The
> > > answer is probably in the archives, because someone understood the
> problem
> > > enough to document "Some XML-related functions may not work at all on
> > > non-ASCII data when the server encoding is not UTF-8. This is known to
> be
> > > an
> > > issue for xpath() in particular."
> >
> >
> > Probably there are two possible issues
>
> Would you research in the archives to confirm?
>
> > 1. what I touched - recv function does encoding to database encoding -
> but
> > document encoding is not updated.
>
> Using xml_parse() would fix that, right?
>

It should help, because it cuts the header - but it does a little more work
than we need - it also checks whether the xml is a document or not.

In most of these functions the important work is done in the parse_xml_decl
function.

One possible fix - a similar technique is already used several times in xml.c 

Re: [HACKERS] pg_stat_wal_write statistics view

2017-04-05 Thread Andres Freund
Hi,

On 2017-03-30 13:10:41 +1100, Haribabu Kommi wrote:
> diff --git a/src/backend/access/transam/xlog.c 
> b/src/backend/access/transam/xlog.c
> index 5d58f09..a29c108 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -600,6 +600,9 @@ typedef struct XLogCtlData
>*/
>   XLogwrtResult LogwrtResult;
>  
> + /* Protected by WALWriteLock */
> + PgStat_WalWritesCounts stats;

>  /*
> + * Check whether the current process is a normal backend or not.
> + * This function checks for the background processes that does
> + * some WAL write activity only and other background processes
> + * are not considered. It considers all the background workers
> + * as WAL write activity workers.
> + *
> + * Returns FALSE - when the current process is a normal backend
> + *  TRUE - when the current process a background process/worker
> + */

I don't think we commonly capitalize true/false these days.

> +static bool
> +am_background_process()
> +{
> + /* check whether current process is a background process/worker? */
> + if (AmBackgroundWriterProcess() ||
> + AmWalWriterProcess() ||
> + AmCheckpointerProcess() ||
> + AmStartupProcess() ||
> + IsBackgroundWorker ||
> + am_walsender ||
> + am_autovacuum_worker)
> + {
> + return true;
> + }
> +
> + return false;
> +}

Uhm, wouldn't inverting this be a lot easier and more future-proof?  There are
far fewer non-background processes.

> +/*
>   * Write and/or fsync the log at least as far as WriteRqst indicates.
>   *
>   * If flexible == TRUE, we don't have to write as far as WriteRqst, but
> @@ -2341,6 +2377,9 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>   int npages;
>   int startidx;
>   uint32  startoffset;
> + instr_time  io_start,
> + io_time;
> + boolis_background_process = am_background_process();
>  
>   /* We should always be inside a critical section here */
>   Assert(CritSectionCount > 0);
> @@ -2458,6 +2497,11 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>   /* OK to write the page(s) */
>   from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
>   nbytes = npages * (Size) XLOG_BLCKSZ;
> +
> + /* Start timer to acquire start time of the wal write */
> + if (track_io_timing)
> + INSTR_TIME_SET_CURRENT(io_start);
> +

I'm more than a bit hesitant about adding overhead to WAL writing - it's
already quite a bit of a bottleneck.  Normally track_io_timing just
makes things a tiny bit slower, but doesn't cause a scalability issue;
here it does.  This is all done under an already highly contended lock.


>   nleft = nbytes;
>   do
>   {
> @@ -2480,6 +2524,34 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>   from += written;
>   } while (nleft > 0);
>  
> + /* calculate the total time spent for wal writing */
> + if (track_io_timing)
> + {
> + INSTR_TIME_SET_CURRENT(io_time);
> + INSTR_TIME_SUBTRACT(io_time, io_start);
> +
> + if (is_background_process)
> + XLogCtl->stats.total_write_time += 
> INSTR_TIME_GET_MILLISEC(io_time);
> + else
> + XLogCtl->stats.backend_total_write_time 
> += INSTR_TIME_GET_MILLISEC(io_time);
> + }
> + else
> + {
> + XLogCtl->stats.total_write_time = 0;
> + XLogCtl->stats.backend_total_write_time = 0;
> + }
> +
> + /* check whether writer is a normal backend or not? */
> + if (is_background_process)
> + XLogCtl->stats.writes++;
> + else
> + XLogCtl->stats.backend_writes++;
> +
> + if (is_background_process)
> + XLogCtl->stats.write_blocks += npages;
> + else
> + XLogCtl->stats.backend_write_blocks += npages;
> +
>   /* Update state for write */
>   openLogOff += nbytes;
>   npages = 0;
> @@ -2499,8 +2571,29 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>*/
>   if (finishing_seg)
>   {
> + /* Start timer to acquire start time of the wal 
> sync */
> + if 

Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan

2017-04-05 Thread Andres Freund
Hi,


I'd like some input from other committers whether we want this.  I'm
somewhat doubtful, but don't have particularly strong feelings.


> +
> +  
> +   Block level PRAGMA
> +
> +   
> +PRAGMA
> +in PL/pgSQL
> +   
> +
> +   
> +The block level PRAGMA allows to change the
> +PL/pgSQL compiler behavior. Currently
> +only PRAGMA PLAN_CACHE is supported.

Why are we doing this on a block level?


> +
> +CREATE FUNCTION enforce_fresh_plan(_id text) RETURNS boolean AS $$
> +DECLARE
> +  PRAGMA PLAN_CACHE(force_custom_plan);
> +BEGIN
> +  -- in this block every embedded query uses one shot plan

*plans


> +
> + PRAGMA PLAN_CACHE
> +
> + 
> +  The plan cache behavior can be controlled using
> +  PRAGMA PLAN_CACHE. This PRAGMA can be used both
> +  for whole function or in individual blocks. The following options are

*functions


> +  possible: DEFAULT - default
> +  PL/pgSQL implementation - the system tries
> +  to decide between custom plan and generic plan after five query
> +  executions, FORCE_CUSTOM_PLAN - the chosen execution
> +  plan will be the one shot plan - it is specific for every set of
> +  used paramaters, FORCE_GENERIC_PLAN - the generic
> +  plan will be used from the start.

I don't think it's a good idea to explain this here, this'll just get
outdated.  I think we should rather have a link here.


> + 
> +
> + 
> +  
> +   PRAGMA PLAN_CACHE
> +   in PL/pgSQL
> +  
> +  The plan for INSERT is always a generic
> plan.

That's this specific insert, right? Should be mentioned, sounds more
generic to me.

> +/* --
> + * Returns pointer to current compiler settings
> + * --
> + */
> +PLpgSQL_settings *
> +plpgsql_current_settings(void)
> +{
> + return current_settings;
> +}
> +
> +
> +/* --
> + * Setup default compiler settings
> + * --
> + */
> +void
> +plpgsql_settings_init(PLpgSQL_settings *settings)
> +{
> + current_settings = settings;
> +}

Hm. This is only ever set to _settings.


> +/* --
> + * Set compiler settings
> + * --
> + */
> +void
> +plpgsql_settings_set(PLpgSQL_settings *settings)
> +{
> + PLpgSQL_nsitem *ns_cur = ns_top;
> +
> + /*
> +  * Modify settings directly, when ns has local settings data.
> +  * When ns uses shared settings, create settings first.
> +  */
> + while (ns_cur->itemtype != PLPGSQL_NSTYPE_LABEL)
> + ns_cur = ns_cur->prev;
> +
> + if (ns_cur->local_settings == NULL)
> + {
> + ns_cur->local_settings = palloc(sizeof(PLpgSQL_settings));
> + ns_cur->local_settings->prev = current_settings;
> + current_settings = ns_cur->local_settings;
> + }
> +
> + current_settings->cursor_options = settings->cursor_options;
> +}

This seems like a somewhat weird method.  Why do we have a global
settings variable, when we essentially just want to use something in the
current ns?



- Andres




Re: [HACKERS] PG_GETARG_GISTENTRY?

2017-04-05 Thread Mark Dilger

> On Apr 5, 2017, at 1:12 PM, Tom Lane  wrote:
> 
> Mark Dilger  writes:
>> I have written a patch to fix these macro definitions across src/ and 
>> contrib/.
>> Find the patch, attached.  All regression tests pass on my Mac laptop.
> 
> Thanks for doing the legwork on that.  

You are welcome.

> This seems a bit late for v10,
> especially since it's only cosmetic

Agreed.

> , but please put it in the first
> v11 commitfest.

Done.

> 
>> I don't find any inappropriate uses of _P where _PP would be called for.  I 
>> do,
>> however, notice that some datatypes' functions are written to use 
>> PG_GETARG_*_P
>> where PG_GETARG_*_PP might be more efficient.
> 
> Yeah.  I think Noah did some work in that direction already, but I don't
> believe he claimed to have caught everything.  Feel free to push further.

Thanks for clarifying.





Re: [HACKERS] PG_GETARG_GISTENTRY?

2017-04-05 Thread Tom Lane
Mark Dilger  writes:
> I have written a patch to fix these macro definitions across src/ and 
> contrib/.
> Find the patch, attached.  All regression tests pass on my Mac laptop.

Thanks for doing the legwork on that.  This seems a bit late for v10,
especially since it's only cosmetic, but please put it in the first
v11 commitfest.

> I don't find any inappropriate uses of _P where _PP would be called for.  I 
> do,
> however, notice that some datatypes' functions are written to use 
> PG_GETARG_*_P
> where PG_GETARG_*_PP might be more efficient.

Yeah.  I think Noah did some work in that direction already, but I don't
believe he claimed to have caught everything.  Feel free to push further.

regards, tom lane




Re: [HACKERS] PG_GETARG_GISTENTRY?

2017-04-05 Thread Mark Dilger

> On Apr 5, 2017, at 9:23 AM, Tom Lane  wrote:
> 
> Robert Haas  writes:
>> On Wed, Mar 29, 2017 at 11:54 PM, Tom Lane  wrote:
>>> Andres Freund  writes:
 we have a good number of '(GISTENTRY *) PG_GETARG_POINTER(n)' in our
 code - looks a bit better & shorter to have PG_GETARG_GISTENTRY(n).
> 
>>> Should be PG_GETARG_GISTENTRY_P to match existing conventions,
>>> otherwise +1
> 
>> I have never quite understood why some of those macros have _P or _PP
>> on the end and others don't.
> 
> _P means "pointer to".  _PP was introduced later to mean "pointer to
> packed (ie, possibly short-header) datum".  Macros that mean to fetch
> pointers to pass-by-ref data, but aren't using either of those naming
> conventions, are violating project conventions, not least because you
> don't know what they're supposed to do with short-header varlena input.
> If I had a bit more spare time I'd run around and change any such macros.
> 
> In short, if you are supposed to write
> 
>   FOO  *val = PG_GETARG_FOO(n);
> 
> then the macro designer blew it, because the name implies that it
> returns FOO, not pointer to FOO.  This should be
> 
>   FOO  *val = PG_GETARG_FOO_P(n);
> 
>   regards, tom lane

I have written a patch to fix these macro definitions across src/ and contrib/.
Find the patch, attached.  All regression tests pass on my Mac laptop.

I don't find any inappropriate uses of _P where _PP would be called for.  I do,
however, notice that some datatypes' functions are written to use PG_GETARG_*_P
where PG_GETARG_*_PP might be more efficient.  Varbit's bitoctetlength function
could detoast only the header ala PG_DETOAST_DATUM_SLICE to return the
octet length, rather than detoasting the whole thing.  But that seems a
different issue, and patches to change that might have been rejected in the
past so far as I know, so I did not attempt any such changes here.

Mark Dilger



PG_GETARG_foo_P.patch.1
Description: Binary data






Re: [HACKERS] increasing the default WAL segment size

2017-04-05 Thread Simon Riggs
On 5 April 2017 at 08:36, Peter Eisentraut
 wrote:
> On 4/5/17 06:04, Beena Emerson wrote:
>> I suggest the next step is to dial up the allowed segment size in
>> configure and run some tests about what a reasonable maximum value could
>> be.  I did a little bit of that, but somewhere around 256 MB, things got
>> really slow.
>>
>>
>> Would it be better if just increase the limit to 128MB for now?
>> In next we can change the WAL file name format and expand the range?
>
> I don't think me saying it felt a bit slow around 256 MB is a proper
> technical analysis that should lead to the conclusion that that upper
> limit should be 128 MB. ;-)
>
> This tells me that there is a lot of explore and test here before we
> should let it loose on users.

Agreed

> I think the best we should do now is spend a bit of time exploring
> whether/how larger values of segment size behave, and bump the hardcoded
> configure limit if we get positive results.  Everything else should
> probably be postponed.
>
> (Roughly speaking, to get started, this would mean compiling with
> --with-wal-segsize 16, 32, 64, 128, 256, run make check-world both
> sequentially and in parallel, and take note of a) passing, b) run time,
> c) disk space.)

I've committed the rest of Beena's patch to allow this testing to
occur up to 1024MB.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-04-05 Thread Robert Haas
On Wed, Apr 5, 2017 at 2:32 PM, Pavan Deolasee  wrote:
>> The only other idea that I have for a really clean solution here is to
>> support this only for index types that are amcanreturn, and actually
>> compare the value stored in the index tuple with the one stored in the
>> heap tuple, ensuring that new index tuples are inserted whenever they
>> don't match and then using the exact same test to determine the
>> applicability of a given index pointer to a given heap tuple.
>
> Just so that I understand, are you suggesting that while inserting WARM
> index pointers, we check if the new index tuple will look exactly the same
> as the old index tuple and not insert a duplicate pointer at all?

Yes.

> I considered that, but it will require us to do an index lookup during WARM
> index insert and for non-unique keys, that may or may not be exactly cheap.

I don't think it requires that.  You should be able to figure out
based on the tuple being updated and the corresponding new tuple
whether this will be true or not.

> Or we need something like what Claudio wrote to sort all index entries by
> heap TIDs. If we do that, then the recheck can be done just based on the
> index and heap flags (because we can then turn the old index pointer into a
> CLEAR pointer. Index pointer is set to COMMON during initial insert).

Yeah, I think that patch is going to be needed for some of the storage
work I'm interested in doing, too, so I am tentatively in favor of
it, but I wasn't proposing using it here.

> The other way is to pass old tuple values along with the new tuple values to
> amwarminsert, build index tuples and then do a comparison. For duplicate
> index tuples, skip WARM inserts.

This is more what I was thinking.  But maybe one of the other ideas
you wrote here is better; not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add pgstathashindex() to get hash index table statistics.

2017-04-05 Thread Robert Haas
On Thu, Mar 23, 2017 at 1:54 PM, Ashutosh Sharma  wrote:
>> Yeah, but I think "unused" might be better.  Because a page could be
>> in use (as an overflow page or primary bucket page) and still be
>> empty.
>
> Based on the earlier discussions, I have prepared a patch that would
> allow pgstathashindex() to show the number of unused pages in hash
> index. Please find the attached patch for the same. Thanks.

My idea is that we shouldn't end up with both a zero_pages column and
an unused_pages column.  Instead, we should end up with just an
unused_pages column, which will include both pages that are all-zeroes
and pages that have a valid special space marked as LH_UNUSED.

Also, I don't see why it's correct to test PageIsEmpty() here instead
of just testing the page type as we do in pageinspect.

Attached is a revised patch that shows what I have in mind; please
review.  Along the way I made the code for examining the page type
more similar to what pageinspect does, because I see no reason for
those things to be different, and I think the pageinspect code is
better.
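
For review purposes, usage would then look roughly like this (index name
invented; the surrounding columns are assumed from the existing function):

SELECT bucket_pages, overflow_pages, bitmap_pages, unused_pages,
       live_items, dead_items
FROM pgstathashindex('pgbench_accounts_aid_idx');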

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgstathashindex-empty.patch
Description: Binary data



Re: [HACKERS] multivariate statistics (v25)

2017-04-05 Thread Tels
Moin,

On Wed, April 5, 2017 2:52 pm, Simon Riggs wrote:
> On 5 April 2017 at 10:47, David Rowley 
> wrote:
>
>>> I have some other comments.
>
> Me too.
>
>
> CREATE STATISTICS should take ShareUpdateExclusiveLock like ANALYZE.
>
> This change is in line with other changes in this and earlier
> releases. Comments and docs included.
>
> Patch ready to be applied directly barring objections.

I know I'm a bit late, but isn't the syntax backwards?

"CREATE STATISTICS s1 WITH (dependencies) ON (col_a, col_b) FROM table;"

These do it the other way round:

CREATE INDEX idx ON table (col_a);

AND:

   CREATE TABLE t (
 id INT  REFERENCES table_2 (col_b);
   );

Won't this be confusing and make things hard to remember?

Sorry for not asking earlier, I somehow missed this.

Regards,

Tels




Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

2017-04-05 Thread Andres Freund
Hi,


I'm somewhat inclined to think that this really would be better done in
an extension like pg_stat_statements.


On 2017-03-08 14:39:23 +1100, Haribabu Kommi wrote:
> On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier 

> + 
> +  track_sql (boolean)
> +  
> +   track_sql configuration parameter
> +  
> +  
> +  
> +   
> +Enables collection of different SQL statement statistics that are
> +executed on the instance. This parameter is off by default. Only
> +superusers can change this setting.
> +   
> +  
> + 
> +

I don't like this name much, seems a bit too generic to
me. 'track_activities', 'track_io_timings' are at least a bit clearer.
How about track_statement_statistics + corresponding view/function renaming?


> +  
> +   pg_stat_sql View
> +   
> +
> +
> +  Column
> +  Type
> +  Description
> + 
> +
> +
> +   
> +
> + tag
> + text
> + Name of the SQL statement
> +

It's not really the "name" of a statement. Internally and IIRC elsewhere
in the docs we describe something like this as tag?
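
For reference, usage of the proposed view would presumably look roughly like
this (the counter column name is my assumption, not taken from the patch):

SET track_sql = on;  -- superuser only, per the docs above
SELECT tag, count FROM pg_stat_sql ORDER BY count DESC;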


> +/* --
> + * pgstat_send_sqlstmt(void)
> + *
> + *   Send SQL statement statistics to the collector
> + * --
> + */
> +static void
> +pgstat_send_sqlstmt(void)
> +{
> + PgStat_MsgSqlstmt msg;
> + PgStat_SqlstmtEntry *entry;
> + HASH_SEQ_STATUS hstat;
> +
> + if (pgstat_backend_sql == NULL)
> + return;
> +
> + pgstat_setheader(_hdr, PGSTAT_MTYPE_SQLSTMT);
> + msg.m_nentries = 0;
> +
> + hash_seq_init(, pgstat_backend_sql);
> + while ((entry = (PgStat_SqlstmtEntry *) hash_seq_search()) != 
> NULL)
> + {
> + PgStat_SqlstmtEntry *m_ent;
> +
> + /* Skip it if no counts accumulated since last time */
> + if (entry->count == 0)
> + continue;
> +
> + /* need to convert format of time accumulators */
> + m_ent = _entry[msg.m_nentries];
> + memcpy(m_ent, entry, sizeof(PgStat_SqlstmtEntry));
> +
> + if (++msg.m_nentries >= PGSTAT_NUM_SQLSTMTS)
> + {
> + pgstat_send(, offsetof(PgStat_MsgSqlstmt, 
> m_entry[0]) +
> + msg.m_nentries * 
> sizeof(PgStat_SqlstmtEntry));
> + msg.m_nentries = 0;
> + }
> +
> + /* reset the entry's count */
> + entry->count = 0;
> + }
> +
> + if (msg.m_nentries > 0)
> + pgstat_send(, offsetof(PgStat_MsgSqlstmt, m_entry[0]) +
> + msg.m_nentries * 
> sizeof(PgStat_SqlstmtEntry));

Hm. So pgstat_backend_sql is never deleted, which'll mean that if a
backend lives long we'll continually iterate over a lot of 0 entries.  I
think performance evaluation of this feature should take that into
account.


> +}
> +
> +/*
> + * Count SQL statement for pg_stat_sql view
> + */
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> + PgStat_SqlstmtEntry *htabent;
> + boolfound;
> +
> + if (!pgstat_backend_sql)
> + {
> + /* First time through - initialize SQL statement stat table */
> + HASHCTL hash_ctl;
> +
> + memset(_ctl, 0, sizeof(hash_ctl));
> + hash_ctl.keysize = NAMEDATALEN;
> + hash_ctl.entrysize = sizeof(PgStat_SqlstmtEntry);
> + pgstat_backend_sql = hash_create("SQL statement stat entries",
> + 
>  PGSTAT_SQLSTMT_HASH_SIZE,
> + 
>  _ctl,
> + 
>  HASH_ELEM | HASH_BLOBS);
> + }
> +
> + /* Get the stats entry for this SQL statement, create if necessary */
> + htabent = hash_search(pgstat_backend_sql, commandTag,
> +   HASH_ENTER, );
> + if (!found)
> + htabent->count = 1;
> + else
> + htabent->count++;
> +}


That's not really something in this patch, but a lot of this would be
better if we didn't internally have command tags as strings, but as an
enum.  We should really only convert to a string when needed.  That
we're doing string comparisons on Portal->commandTag is just plain bad.

If so, we could also make this whole patch a lot cheaper - have a fixed
size array that has an entry for every possible tag (possibly even in
shared memory, and then use atomics there).

> +++ b/src/backend/tcop/postgres.c
> @@ -1109,6 +1109,12 @@ exec_simple_query(const char *query_string)
>  
>   PortalDrop(portal, false);
>  
> + /*
> +  * Count SQL statement for pg_stat_sql view
> +  */
> + if (pgstat_track_sql)
> + 

Re: [HACKERS] multivariate statistics (v25)

2017-04-05 Thread Simon Riggs
On 5 April 2017 at 10:47, David Rowley  wrote:

>> I have some other comments.

Me too.


CREATE STATISTICS should take ShareUpdateExclusiveLock like ANALYZE.

This change is in line with other changes in this and earlier
releases. Comments and docs included.

Patch ready to be applied directly barring objections.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


create_statistics_lock_reduction.v1.patch
Description: Binary data



Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-05 Thread Keith Fiske
On Wed, Apr 5, 2017 at 11:19 AM, Robert Haas  wrote:

> On Wed, Apr 5, 2017 at 5:57 AM, Rahila Syed 
> wrote:
> >>Could you briefly elaborate why you think the lack of global index support
> >>would be a problem in this regard?
> > I think the following can happen if we allow rows satisfying the new
> > partition to lie around in the default partition until a background
> > process moves them. Consider a scenario where the partition key is a
> > primary key and the data in the default partition has not yet been
> > moved into the newly added partition. If new data is now added into the
> > new partition with a key that also exists in the default partition,
> > there will be data duplication. If we then scan the partitioned table
> > for that key (from both the default and the new partition, as we have
> > not moved the rows), it will fetch both rows.
> > Unless we have global indexes for partitioned tables, there is a chance
> > of data duplication between a child table added after the default
> > partition and the default partition itself.
>
> Yes, I think it would be completely crazy to try to migrate the data
> in the background:
>
> - The migration might never complete because of a UNIQUE or CHECK
> constraint on the partition to which rows are being migrated.
>
> - Even if the migration eventually succeeded, such a design abandons
> all hope of making INSERT .. ON CONFLICT DO NOTHING work sensibly
> while the migration is in progress, unless the new partition has no
> UNIQUE constraints.
>
> - Partition-wise join and partition-wise aggregate would need to have
> special case handling for the case of an unfinished migration, as
> would any user code that accesses partitions directly.
>
> - More generally, I think users expect that when a DDL command
> finishes execution, it's done all of the work that there is to do (or
> at the very least, that any remaining work has no user-visible
> consequences, which would not be the case here).
>
> IMV, the question of whether we have efficient ways to move data
> around between partitions is somewhat separate from the question of
> whether partitions can be defined in a certain way in the first place.
> The problems that Keith refers to upthread already exist for
> subpartitioning; you've got to detach the old partition, create a new
> one, and then reinsert the data.  And for partitioning an
> unpartitioned table: create a replacement table, insert all the data,
> substitute it for the original table.  The fact that we have these
> limitation is not good, but we're not going to rip out partitioning
> entirely because we don't have clever ways of migrating the data in
> those cases, and the proposed behavior here is not any worse.
>
> Also, waiting for those problems to get fixed might be waiting for
> Godot.  I'm not really all that sanguine about our chances of coming
> up with a really nice way of handling these cases.  In a designed
> based on table inheritance, you can leave it murky where certain data
> is supposed to end up and migrate it on-line and you might get away
> with that, but a major point of having declarative partitioning at all
> is to remove that sort of murkiness.  It's probably not that hard to
> come up with a command that locks the parent and moves data around via
> full table scans, but I'm not sure how far that really gets us; you
> could do the same thing easily enough with a sequence of commands
> generated via a script.  And being able to do this in a general way
> without a full table lock looks pretty hard - it doesn't seem
> fundamentally different from trying to perform a table-rewriting
> operation like CLUSTER without a full table lock, which we also don't
> support.  The executor is not built to cope with any aspect of the
> table definition shifting under it, and that includes the set of child
> tables with are partitions of the table mentioned in the query.  Maybe
> the executor can be taught to survive such definitional changes at
> least in limited cases, but that's a much different project than
> allowing default partitions.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Confirmed that the v5 patch works with the examples given in the original post,
but it segfaulted when I tried the examples I used in my blog post (taken from
the documentation at the time I wrote it).
https://www.keithf4.com/postgresql-10-built-in-partitioning/

keith@keith=# drop table cities;
DROP TABLE
Time: 6.055 ms
keith@keith=# CREATE TABLE cities (
city_id bigserial not null,
name text not null,
population   int
) PARTITION BY LIST (initcap(name));
CREATE TABLE
Time: 7.130 ms
keith@keith=# CREATE TABLE cities_west
PARTITION OF cities (
CONSTRAINT city_id_nonzero CHECK (city_id != 0)
) FOR VALUES IN ('Los Angeles', 'San Francisco');
CREATE TABLE
Time: 6.690 ms
keith@keith=# CREATE TABLE cities_default
keith-# PARTITION OF cities FOR VALUES IN 

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-04-05 Thread Peter Geoghegan
On Wed, Apr 5, 2017 at 11:27 AM, Andres Freund  wrote:
> I propose we move this patch to the next CF.

I agree. I think it's too late to be working out fine details around
TOAST like this. This is a patch that touches the storage format in a
fairly fundamental way.

The idea of turning WARM on or off reminds me a little bit of the way
it was at one time suggested that HOT not be used against catalog
tables, a position that Tom pushed against. I'm not saying that it's
necessarily a bad idea, but we should exhaust alternatives, and have a
clear rationale for it.

-- 
Peter Geoghegan




Re: [HACKERS] pgbench - allow to store select results into variables

2017-04-05 Thread Fabien COELHO


Hello Tatsuo-san,


It seems the new feature \gset doesn't work with tables having non-ASCII
column names:


Indeed. The same error is triggered with the \set syntax, which does not 
involve any query execution.


I have added a sentence mentioning the restriction where variables are
first discussed in the documentation; see the attached patch.


--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 1eee8dc..c40f73e 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -773,6 +773,7 @@ pgbench  options  dbname
There is a simple variable-substitution facility for script files.
Variables can be set by the command-line -D option,
explained above, or by the meta commands explained below.
+   Variable names are restricted to ASCII strings.
In addition to any variables preset by -D command-line options,
there are a few variables that are preset automatically, listed in
. A value specified for these
@@ -816,6 +817,51 @@ pgbench  options  dbname
   
 
   
+   
+
+ \gcset [prefix] or
+ \gset [prefix]
+
+
+
+ 
+  These commands may be used to end SQL queries, replacing a semicolon.
+  \gcset replaces an embedded semicolon (\;) within
+  a compound SQL command, and \gset replaces a final
+  (;) semicolon which ends the SQL command. 
+ 
+
+ 
+  When these commands are used, the preceding SQL query is expected to
+  return one row, the columns of which are stored into variables named after
+  column names, and prefixed with prefix if provided.
+ 
+
+ 
+  The following example puts the final account balance from the first query
+  into variable abalance, and fills variables
+  one, two and p_three with
+  integers from a compound query.
+
+UPDATE pgbench_accounts
+  SET abalance = abalance + :delta
+  WHERE aid = :aid
+  RETURNING abalance \gset
+-- compound of two queries
+SELECT 1 AS one, 2 AS two \gcset
+SELECT 3 AS three \gset p_
+
+ 
+
+ 
+  
+\gcset and \gset commands do not work when
+empty SQL queries appear within a compound SQL command.
+  
+ 
+
+   
+

 
  \set varname expression
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 78f1e6b..ac9289b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -375,11 +375,14 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
 typedef struct
 {
-	char	   *line;			/* text of command line */
+	char	   *line;			/* first line for short display */
+	char	   *lines;			/* full multi-line text of command */
 	int			command_num;	/* unique index of this Command struct */
 	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
 	int			argc;			/* number of command words */
 	char	   *argv[MAX_ARGS]; /* command word list */
+	int			compound;   /* last compound command (number of \;) */
+	char	  **gset;   /* per-compound command prefix */
 	PgBenchExpr *expr;			/* parsed expression, if needed */
 	SimpleStats stats;			/* time spent in this command */
 } Command;
@@ -1225,6 +1228,105 @@ getQueryParams(CState *st, const Command *command, const char **params)
 		params[i] = getVariable(st, command->argv[i + 1]);
 }
 
+/* read all responses from backend */
+static bool
+read_response(CState *st, char **gset)
+{
+	PGresult   *res;
+	int			compound = -1;
+
+	while ((res = PQgetResult(st->con)) != NULL)
+	{
+		compound += 1;
+
+		switch (PQresultStatus(res))
+		{
+			case PGRES_COMMAND_OK: /* non-SELECT commands */
+			case PGRES_EMPTY_QUERY: /* may be used for testing no-op overhead */
+if (gset[compound] != NULL)
+{
+	fprintf(stderr,
+			"client %d file %d command %d compound %d: "
+			"\\gset expects a row\n",
+			st->id, st->use_file, st->command, compound);
+	st->ecnt++;
+	return false;
+}
+break; /* OK */
+
+			case PGRES_TUPLES_OK:
+if (gset[compound] != NULL)
+{
+	/* store result into variables */
+	int ntuples = PQntuples(res),
+		nfields = PQnfields(res),
+		f;
+
+	if (ntuples != 1)
+	{
+		fprintf(stderr,
+"client %d file %d command %d compound %d: "
+"expecting one row, got %d\n",
+st->id, st->use_file, st->command, compound, ntuples);
+		st->ecnt++;
+		PQclear(res);
+		discard_response(st);
+		return false;
+	}
+
+	for (f = 0; f < nfields ; f++)
+	{
+		char *varname = PQfname(res, f);
+		if (*gset[compound] != '\0')
+			varname = psprintf("%s%s", gset[compound], varname);
+
+		/* store result as a string */
+		if (!putVariable(st, "gset", varname,
+		 PQgetvalue(res, 0, f)))
+		{
+			/* internal error, should it rather abort? */
+			fprintf(stderr,
+	"client %d file %d command %d compound %d: "
+	"error storing into var %s\n",
+	st->id, st->use_file, st->command, compound,
+	

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-04-05 Thread Andres Freund
On 2017-04-05 09:36:47 -0400, Robert Haas wrote:
> By the way, the "Converting WARM chains back to HOT chains" section of
> README.WARM seems to be out of date.  Any chance you could update that
> to reflect the current state and thinking of the patch?

I propose we move this patch to the next CF.  That shouldn't prevent you
working on it, although focusing on review of patches that still might
make it wouldn't hurt either.

- Andres




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-05 Thread Robert Haas
On Wed, Apr 5, 2017 at 2:18 PM, Fabien COELHO  wrote:
>>> [Action required within three days.  This is a generic notification.]
>>
>>
>> I will work on this next week.  I believe I will be able to provide a
>> patch for the web site CSS by April 12, but ultimate success will likely
>> depend on the collaboration of the web team.
>
>
> Just to point out that as it has been decided to fix the web site CSS, there
> is no expected change for pg 10 source base, so ISTM that the item can be
> closed as far as pg 10 is concerned.

Well, the documentation doesn't look right, currently, so I think that
should be fixed before we close this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pageinspect and hash indexes

2017-04-05 Thread Robert Haas
On Fri, Mar 24, 2017 at 3:44 PM, Ashutosh Sharma  wrote:
> Agreed. Moreover, the previous approach might even change the current
> behaviour of functions like hash_page_items() and hash_page_stats()
> when we pass UNUSED pages to these functions. Attached is
> the newer version of the patch.

This patch doesn't completely solve the problem:

pgbench -i -s 40
alter table pgbench_accounts drop constraint pgbench_accounts_pkey;
create index on pgbench_accounts using hash (aid);
insert into pgbench_accounts select * from pgbench_accounts;
select hash_page_type, min(n), max(n), count(*) from (select n,
hash_page_type(get_raw_page('pgbench_accounts_aid_idx'::text, n)) from
generate_series(0,
(pg_relation_size('pgbench_accounts_aid_idx')/8192)::integer - 1) n) x
group by 1;
ERROR:  index table contains zero page

This happens because of the sparse allocation forced by
_hash_alloc_buckets.  Perhaps the real fix is to have that function
initialize all of the pages instead of just the last one, but unless
and until we decide to do that, we've got to cope with zero pages in
the index.  Actually, even if we did fix that I think we'd still need
to do this, because the way relation extension works in general is
that we first write a page of zeroes and then go back and fill in the
data; an intervening crash can leave behind the intermediate state.
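
To illustrate what "coping with zero pages" means here, a rough sketch
(the helper name and exact shape are assumed, not necessarily the
committed code):

static const char *
hash_page_type_name(Page page)
{
	HashPageOpaque opaque;

	/*
	 * An all-zero page has never been initialized; report it as unused
	 * instead of erroring out.
	 */
	if (PageIsNew(page))
		return "unused";

	opaque = (HashPageOpaque) PageGetSpecialPointer(page);
	switch (opaque->hasho_flag & LH_PAGE_TYPE)
	{
		case LH_META_PAGE:
			return "metapage";
		case LH_OVERFLOW_PAGE:
			return "overflow";
		case LH_BUCKET_PAGE:
			return "bucket";
		case LH_BITMAP_PAGE:
			return "bitmap";
		default:
			return "unused";
	}
}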

A second problem is that the one page initialized by
_hash_alloc_buckets needs to end up with a valid special space;
otherwise, you get the same error Jeff complained about originally
when you try to use hash_page_type() on it.

I fixed those issues and committed this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-05 Thread Fabien COELHO



(https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain)


[Action required within three days.  This is a generic notification.]


I will work on this next week.  I believe I will be able to provide a
patch for the web site CSS by April 12, but ultimate success will likely
depend on the collaboration of the web team.


Just to point out that, as it has been decided to fix the web site CSS,
there is no expected change to the pg 10 source base, so ISTM that the item
can be closed as far as pg 10 is concerned.


--
Fabien.




Re: [HACKERS] Use American English spelling in pg_waldump error message

2017-04-05 Thread Peter Eisentraut
On 3/29/17 07:10, Daniel Gustafsson wrote:
> We use “unrecognize” rather than “unrecognise” in all other error messages in
> the tree; the attached patch fixes the one place where the latter spelling was
> used.

Committed, and I fixed a couple of code comments similarly.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-05 Thread Stephen Frost
Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 4/5/17 02:56, Noah Misch wrote:
> > On Thu, Mar 23, 2017 at 11:21:39PM -0400, Peter Eisentraut wrote:
> >> I think the fix belongs into the web site CSS, so there is nothing to
> >> commit into PostgreSQL here.  I will close the commit fest entry, but I
> >> have added a section to the open items list so we keep track of it.
> >> (https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain)
> > 
> > [Action required within three days.  This is a generic notification.]
> 
> I will work on this next week.  I believe I will be able to provide a
> patch for the web site CSS by April 12, but ultimate success will likely
> depend on the collaboration of the web team.

I'm happy to try and help with this.

Thanks!

Stephen




Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build

2017-04-05 Thread Peter Eisentraut
On 4/5/17 02:56, Noah Misch wrote:
> On Thu, Mar 23, 2017 at 11:21:39PM -0400, Peter Eisentraut wrote:
>> I think the fix belongs into the web site CSS, so there is nothing to
>> commit into PostgreSQL here.  I will close the commit fest entry, but I
>> have added a section to the open items list so we keep track of it.
>> (https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain)
> 
> [Action required within three days.  This is a generic notification.]

I will work on this next week.  I believe I will be able to provide a
patch for the web site CSS by April 12, but ultimate success will likely
depend on the collaboration of the web team.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] partitioned tables and contrib/sepgsql

2017-04-05 Thread Mike Palmiotto
On Wed, Apr 5, 2017 at 1:19 PM, Tom Lane  wrote:
> Peter Eisentraut  writes:
>> On 4/5/17 12:04, Tom Lane wrote:
>>> Conclusion: Fedora's gcc is playing fast and loose somehow with the
>>> command "#include <stdbool.h>"; that does not include the file
>>> you'd think it does; it does something magic inside the compiler.
>>> The magic evidently includes not complaining about duplicate macro
>>> definitions for true and false.
>
>> Perhaps -Wsystem-headers would change the outcome in your case.
>
> Hah, you're right: with that,
>
> In file included from /usr/include/selinux/label.h:9:0,
>  from label.c:40:
> /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include/stdbool.h:34:0: warning: 
> "true" redefined
>  #define true 1
>
> In file included from ../../src/include/postgres.h:47:0,
>  from label.c:11:
> ../../src/include/c.h:206:0: note: this is the location of the previous 
> definition
>  #define true ((bool) 1)
>
> and likewise for "false".  So that mystery is explained.
>
> I stand by my previous patch suggestion, except that we can replace
> the parenthetical comment with something like "(We don't care if
> <stdbool.h> redefines "true"/"false"; those are close enough.)".
>

Sounds good. The updated patchset will include that verbiage, along with
some regression tests for partitioned tables.

Thanks,

-- 
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com




Re: [HACKERS] logical replication and SIGHUP

2017-04-05 Thread Peter Geoghegan
On Wed, Apr 5, 2017 at 10:21 AM, Fujii Masao  wrote:
> Neither the launcher nor the worker handles the SIGHUP signal, so they cannot
> reload the configuration. I think that this is a bug.

+1


-- 
Peter Geoghegan




[HACKERS] logical replication and SIGHUP

2017-04-05 Thread Fujii Masao
Hi,

Neither the launcher nor the worker handles the SIGHUP signal, so they cannot
reload the configuration. I think that this is a bug. Will add this as
an open item barring objection.
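
For reference, the missing piece is just the standard reload idiom; a
rough sketch only (handler name and exact placement assumed, not an
actual patch):

static volatile sig_atomic_t got_SIGHUP = false;

static void
logicalrep_sighup_handler(SIGNAL_ARGS)
{
	int			save_errno = errno;

	got_SIGHUP = true;
	SetLatch(MyLatch);

	errno = save_errno;
}

/* registered before unblocking signals, in both launcher and worker */
pqsignal(SIGHUP, logicalrep_sighup_handler);

/* and checked in the main loop */
if (got_SIGHUP)
{
	got_SIGHUP = false;
	ProcessConfigFile(PGC_SIGHUP);
}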

Regards,

-- 
Fujii Masao




Re: [HACKERS] partitioned tables and contrib/sepgsql

2017-04-05 Thread Tom Lane
Peter Eisentraut  writes:
> On 4/5/17 12:04, Tom Lane wrote:
>> Conclusion: Fedora's gcc is playing fast and loose somehow with the
>> command "#include <stdbool.h>"; that does not include the file
>> you'd think it does; it does something magic inside the compiler.
>> The magic evidently includes not complaining about duplicate macro
>> definitions for true and false.

> Perhaps -Wsystem-headers would change the outcome in your case.

Hah, you're right: with that,

In file included from /usr/include/selinux/label.h:9:0,
 from label.c:40:
/usr/lib/gcc/x86_64-redhat-linux/6.3.1/include/stdbool.h:34:0: warning: "true" 
redefined
 #define true 1
 
In file included from ../../src/include/postgres.h:47:0,
 from label.c:11:
../../src/include/c.h:206:0: note: this is the location of the previous 
definition
 #define true ((bool) 1)

and likewise for "false".  So that mystery is explained.

I stand by my previous patch suggestion, except that we can replace
the parenthetical comment with something like "(We don't care if
<stdbool.h> redefines "true"/"false"; those are close enough.)".

regards, tom lane




Re: [HACKERS] scram and \password

2017-04-05 Thread Heikki Linnakangas

On 04/05/2017 06:53 PM, Robert Haas wrote:

On Sat, Mar 25, 2017 at 1:10 AM, Michael Paquier
 wrote:

On Fri, Mar 24, 2017 at 10:12 PM, Heikki Linnakangas  wrote:

On 03/24/2017 03:02 PM, Michael Paquier wrote:


In order to close this thread, I propose to reuse the patches I sent
here to make scram_build_verifier() available to frontends:

https://www.postgresql.org/message-id/CAB7nPqT4yc3u8wspYkWbG088Ndp6asMH3=zb___ck89ctvz...@mail.gmail.com

And on top of it, modify \password so that it generates an md5 verifier
for pre-9.6 servers and a scram one for post-10 servers by looking at
the backend version of the current connection. What do you think?


Yep, sounds like a plan.


And attached is a set of rebased patches, with createuser and psql's
\password extended to do that.


Heikki, are you going to do something about these?  We're running out of time.


Sorry I've been procrastinating. I'm on it now. (We need to do something 
about this, feature freeze or not..)


At a quick glance, moving pg_frontend_random() to src/common looks like 
a non-starter. It uses pglock_thread() which is internal to libpq, so it 
won't compile as it is. I think I'm going to change 
scram_build_verifier() to take a pre-generated salt as argument, to 
avoid the need for a random number generator in src/common.
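
Roughly the shape I have in mind, as a sketch only (the signature and the
version cutoff below are assumptions, not a final interface):

/* src/common: the caller supplies the salt, so no RNG is needed here */
char *scram_build_verifier(const char *password,
						   const char *salt, int saltlen,
						   int iterations);

/* psql/createuser would then pick the verifier type from the connection
 * (needs libpq-fe.h for PGconn / PQserverVersion) */
static bool
use_scram_verifier(PGconn *conn)
{
	return PQserverVersion(conn) >= 100000;		/* assumed cutoff: v10+ */
}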


- Heikki





Re: [HACKERS] partitioned tables and contrib/sepgsql

2017-04-05 Thread Peter Eisentraut
On 4/5/17 12:04, Tom Lane wrote:
> Conclusion: Fedora's gcc is playing fast and loose somehow with the
> command "#include <stdbool.h>"; that does not include the file
> you'd think it does; it does something magic inside the compiler.
> The magic evidently includes not complaining about duplicate macro
> definitions for true and false.

Perhaps -Wsystem-headers would change the outcome in your case.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Implementation of SASLprep for SCRAM-SHA-256

2017-04-05 Thread Heikki Linnakangas

On 04/05/2017 07:23 AM, Michael Paquier wrote:


On Wed, Apr 5, 2017 at 7:05 AM, Heikki Linnakangas  wrote:

I will continue tomorrow, but I wanted to report on what I've done so far.
Attached is a new patch version, quite heavily modified. Notable changes so
far:


Great, thanks!


* Use Unicode codepoints, rather than UTF-8 bytes packed in 32-bit ints.
IMHO this makes the tables easier to read (to a human), and they are also
packed slightly more tightly (see next two points), as you can fit more
codepoints in a 16-bit integer.


Using codepoints directly is not very consistent with the existing
backend, but for the sake of packing things more tightly, OK.


Oh, I see, we already have similar functions in wchar.c. 
unicode_to_utf8() and utf8_to_unicode(). We should probably move those 
to src/common, rather than re-invent the wheel.
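
The transform itself is the standard one; roughly, for the encoding
direction (a sketch, not the actual wchar.c code):

/* Encode one Unicode codepoint (c <= U+10FFFF) as UTF-8, returning the
 * pointer past the bytes written. */
static unsigned char *
codepoint_to_utf8(unsigned int c, unsigned char *p)
{
	if (c <= 0x7F)
		*p++ = (unsigned char) c;
	else if (c <= 0x7FF)
	{
		*p++ = (unsigned char) (0xC0 | (c >> 6));
		*p++ = (unsigned char) (0x80 | (c & 0x3F));
	}
	else if (c <= 0xFFFF)
	{
		*p++ = (unsigned char) (0xE0 | (c >> 12));
		*p++ = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
		*p++ = (unsigned char) (0x80 | (c & 0x3F));
	}
	else
	{
		*p++ = (unsigned char) (0xF0 | (c >> 18));
		*p++ = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
		*p++ = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
		*p++ = (unsigned char) (0x80 | (c & 0x3F));
	}
	return p;
}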



pg_utf8_islegal() and pg_utf_mblen() should also be moved into their
own file, I think, and wchar.c can use that.


Yeah..


* The list of characters excluded from recomposition is currently hard-coded
in utf_norm_generate.pl. However, that list is available in machine-readable
format, in the file CompositionExclusions.txt. Since we're reading most of the
data from UnicodeData.txt, it would be good to read the exclusion table from a
file, too.


Ouch. Those are present here...
http://www.unicode.org/reports/tr41/tr41-19.html#Exclusions
Definitely it makes more sense to read them from a file.


Did that.


* SASLPrep specifies normalization form KC, but it also specifies that some
characters are mapped to space or nothing. Should do those mappings, too.


Ah, right. Those ones are here:
https://tools.ietf.org/html/rfc3454#appendix-B.1


Yep.


Attached is a new version. Notable changes since yesterday:

* Implemented the rest of SASLPrep: mapping some characters to spaces,
leaving out others, and checking for prohibited characters and
bidirectional strings.


* Moved things around. There's now a separate directory, 
src/common/unicode, which contains the perl scripts and the test code. 
Those are not needed to build from source, as the pre-generated tables 
are put in src/include/common. Similar to the scripts in 
src/backend/utils/mb/Unicode, really.


* Renamed many things from utf_* to unicode_*, since they don't deal 
with utf-8 input anymore.



This is starting to shape up, but still some cleanup work to do. I will 
continue tomorrow..


- Heikki



implement-SASLprep-3.patch.gz
Description: application/gzip



Re: [HACKERS] Functions Immutable but not parallel safe?

2017-04-05 Thread Tom Lane
Peter Eisentraut  writes:
> On 4/5/17 09:58, Robert Haas wrote:
>>> - Maybe add a check to opr_sanity to make sure the default set of
>>> functions is configured the way we want?

>> That seems like a good idea.

> patch

Looks sane to me, although I wonder if opr_sanity ought to be looking
for any other combinations.
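
Something along these lines, presumably (a sketch; the committed query
may well differ):

-- immutable functions that are not marked parallel safe are suspect
SELECT p1.oid, p1.proname
FROM pg_proc AS p1
WHERE p1.provolatile = 'i' AND p1.proparallel <> 's';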

regards, tom lane




Re: [HACKERS] Functions Immutable but not parallel safe?

2017-04-05 Thread Robert Haas
On Wed, Apr 5, 2017 at 12:18 PM, Peter Eisentraut
 wrote:
> On 4/5/17 09:58, Robert Haas wrote:
>>> - Maybe add a check to opr_sanity to make sure the default set of
>>> functions is configured the way we want?
>> That seems like a good idea.
>
> patch

LGTM

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] partitioned tables and contrib/sepgsql

2017-04-05 Thread Tom Lane
Andres Freund  writes:
> GCC generally doesn't warn about macro redefinitions, if both
> definitions are equivalent.

But they're *not* equivalent.  c.h has

#define true	((bool) 1)

whereas so far as I can see the <stdbool.h> definition is just

#define true	1

And if you put those two definitions together you will get a warning.
The lack of one with c.h + <stdbool.h> means somebody is cheating.
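
A self-contained illustration of the point (stand-in definitions assumed
here, not the real headers):

/* redef.c: "gcc -c redef.c" warns:  "true" redefined
 * because the two object-like macros are not token-for-token identical.
 * When one definition comes from a system header, gcc suppresses the
 * warning unless -Wsystem-headers is given, which is what happened here. */
typedef char bool;			/* stand-in for c.h's bool */
#define true ((bool) 1)		/* c.h-style definition */
#define true 1				/* stdbool.h-style definition */

int x = true;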

regards, tom lane




Re: [HACKERS] PG_GETARG_GISTENTRY?

2017-04-05 Thread Tom Lane
Robert Haas  writes:
> On Wed, Mar 29, 2017 at 11:54 PM, Tom Lane  wrote:
>> Andres Freund  writes:
>>> we have a good number of '(GISTENTRY *) PG_GETARG_POINTER(n)' in our
>>> code - looks a bit better & shorter to have PG_GETARG_GISTENTRY(n).

>> Should be PG_GETARG_GISTENTRY_P to match existing conventions,
>> otherwise +1

> I have never quite understood why some of those macros have _P or _PP
> on the end and others don't.

_P means "pointer to".  _PP was introduced later to mean "pointer to
packed (ie, possibly short-header) datum".  Macros that mean to fetch
pointers to pass-by-ref data, but aren't using either of those naming
conventions, are violating project conventions, not least because you
don't know what they're supposed to do with short-header varlena input.
If I had a bit more spare time I'd run around and change any such macros.

In short, if you are supposed to write

FOO  *val = PG_GETARG_FOO(n);

then the macro designer blew it, because the name implies that it
returns FOO, not pointer to FOO.  This should be

FOO  *val = PG_GETARG_FOO_P(n);
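
For the case at hand that would presumably end up as something like this
(a sketch, not necessarily the committed spelling):

#define DatumGetGistEntryP(X)		((GISTENTRY *) DatumGetPointer(X))
#define PG_GETARG_GISTENTRY_P(n)	DatumGetGistEntryP(PG_GETARG_DATUM(n))

/* so a GiST support function reads: */
GISTENTRY  *entry = PG_GETARG_GISTENTRY_P(0);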

regards, tom lane




  1   2   3   >