Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-13 Thread Kuntal Ghosh
On Mon, Jan 9, 2017 at 1:45 PM, Haribabu Kommi  wrote:
> Updated patch is attached.
>
I've a few comments about the patch.

+ This type can accept both 6 and 8 bytes length MAC addresses.
A 6 bytes length MAC address is internally converted to 8 bytes. We
should include this in the docs. Because output is always 8 bytes.
Otherwise, a user may be surprised with the output.

+#define hibits(addr) \
+  ((unsigned long)(((addr)->a<<24)|((addr)->b<<16)|((addr)->c<<8)|((addr)->d)))
+
+#define lobits(addr) \
+  ((unsigned long)(((addr)->e<<24)|((addr)->f<<16)|((addr)->g<<8)|((addr)->h)))
+
There should be some spacing.

+ if (!eight_byte_address)
+ {
+ d = 0xFF;
+ e = 0xFE;
+ }
You already have count variable. Just check (count != 8).

+ res *= 256 * 256;
+ res *= 256 * 256;
Bit shift operator can be used for this. For example: (res << 32).

-DATA(insert ( 403 macaddr_ops PGNSP PGUID 1984  829 t 0 ));
-DATA(insert ( 405 macaddr_ops PGNSP PGUID 1985  829 t 0 ));
+DATA(insert ( 403 macaddr_ops PGNSP PGUID 1984  829 t 0 ));
+DATA(insert ( 405 macaddr_ops PGNSP PGUID 1985  829 t 0 ));
This is unnecessary I guess.

There was some whitespace error while applying the patch. Also, there
are some spacing inconsistencies in the comments. I've not tested with
this patch. I'll let you know once I'm done testing.


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixing matching of boolean index columns to sort ordering

2017-01-13 Thread Michael Paquier
On Fri, Jan 13, 2017 at 10:29 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> And actually, contrary to what is mentioned upthread, the planner is
>> not able to avoid a sort phase if other data types are used, say:
>> =# create table foo (a int, b int);
>> CREATE TABLE
>> =# create index on foo (a, b);
>> CREATE INDEX
>> =# explain (costs off) select * from foo where a = 1 order by b limit 10;
>
> No, there's a difference between "not able to" and "chooses not to".
> In this example case, it just thinks a bitmap scan is cheaper than
> an ordered scan:
>
> The problem with the boolean-column case is it fails to recognize
> that the index matches at all.

Bah. I was sure I was missing something, still I would have thought
that the index scan is cheaper than the bitmap index scan with ORDER
BY. As far as I can see, this patch is not the most elegant thing, but
it has value. So marked as "ready for committer".
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2017-01-13 Thread Fabien COELHO



$ cat loop.sql
\if :x < 1000
 \echo :x
 \set x :x + 1
 \include loop.sql
\fi
$ psql --set x=0 -f loop.sql


Nice one! CPP does not have arithmetic, so it is harder to do that because 
one must reimplement arithmetic with #if...



Somebody is going to think of that workaround for not having loops, and
then whine about how psql runs out of file descriptors and/or stack.


One can already have "include nested too deeply" errors, I guess, without 
a recursion.


I would say that's this consequence is acceptable, and that this is a 
feature.


I think having some kind of client-side test brings significant value 
because it would help writing application schema upgrades for instance, 
and that the this potential whining source is worth handling.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2017-01-13 Thread Tom Lane
Pavel Stehule  writes:
> 2017-01-14 0:20 GMT+01:00 Corey Huinker :
>> - leaving loops out for now?

> +1

I'm just going to say one thing about that: some people will remember
that you can build a Turing machine with either conditionals+iteration
or conditionals+recursion.  I wonder what depth of include-file nesting
psql can support, or whether we'll be able to fix it to optimize tail
recursion of an include file.  Because somebody will be asking for that
if this is the toolset you give them.

regards, tom lane

PS: if I'm being too obscure for you, consider:

$ cat loop.sql
\if :x < 1000
  \echo :x
  \set x :x + 1
  \include loop.sql
\fi
$ psql --set x=0 -f loop.sql

Somebody is going to think of that workaround for not having loops, and
then whine about how psql runs out of file descriptors and/or stack.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] how to correctly invalidate a constraint?

2017-01-13 Thread Pavel Stehule
2017-01-13 22:44 GMT+01:00 Alvaro Herrera :

> Pavel Stehule wrote:
> > Hi
> >
> > I would to do import without RI check - so I disable RI triggers.
> >
> > I would to invalidate constraint on Foreign Keys and then I would to use
> > ALTER TABLE VALIDATE CONSTRAINT ...
> >
> > I didn't find how to invalidate constraint without direct update
> > pg_constraint.
> >
> > Is there some clean way?
>
> I think what you want is:
> - set the constraint as "not valid", so that the following is a valid
>   operation
> - set the RI trigger not to fire, to improve performance of bulk loads
> - do the load
> - activate the trigger
> - validate the constraint
>

yes


>
> We have SQL commands for everything except the first step.  Now my
> question would be: do we want to support that operation as a stand-alone
> thing so that you can construct the above from pieces, or do we want
> some higher-level command so that the above is less cumbersome?  The
> main issue I see is that a single constraint involves several triggers,
> and the triggers have internally-derived, very ugly names.  So in my
> mind the right way to provide this functionality is to have a command
> that operates on the RI constraint and modifies the triggers status.
>
> ALTER TABLE .. ALTER CONSTRAINT [name / ALL] DEACTIVATE
>-- sets constraint as NOT VALID, also sets triggers inactive
>
> [user bulkload occurs here]
>
> ALTER TABLE .. ALTER CONSTRAINT [name / ALL] ACTIVATE
>-- activates triggers, validates constraint
>

In this case I prefer simple low level command

ALTER TABLE .. ALTER CONSTRAINT name NOT VALID

It should to set catalog to state after new NOT VALID constraint.

Regards

Pavel


>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] Packages: Again

2017-01-13 Thread Serge Rielau
On Fri, Jan 13, 2017 at 4:24 PM, Peter Geoghegan  wrote:
On Fri, Jan 13, 2017 at 3:44 PM, Serge Rielau  wrote:
> And sometimes the community DOES go its own way rather than implementing the 
> standard. For example by rejecting the MERGE statement in favor of another 
> syntax and semantic.

That's total nonsense.

MERGE isn't UPSERT…. Peter, you are misreading what I wrote. I did not allege 
that PostgreSQL did the wrong thing. And you are essentially confirming that 
there was debate and MERGE deemed to be not what was wanted. So PG, with 
reason, went with something not in the standard.
That is precisely my point!
Packages are not modules are not nested schemata either. And the argument that 
nested schemata are a no-go because of the standard is invalid for the same 
reason discarding an option other than MERGE because that’s the only thing in 
the standard was invalid.
But what irks me in this debate is that any reasoned and detailed argumentation 
of value of the principle itself is shut down with un-reasoned and un-detailed 
one-liners. “I’m not convinced” is not an argument. Counterpoints require 
content. Something starting with “because …”
If the community does not believe that there is value in a more refined 
grouping of objects than a schema the discussion in DOA. If there is consensus 
that there is value one can debate about the best semantics and language 
covering it.
Cheers Serge

Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2017-01-13 Thread Pavel Stehule
Hi

2017-01-14 0:20 GMT+01:00 Corey Huinker :

> Ok, so activity on this thread has died down. I'm not sure if that's
> consensus or exhaustion.
>

the original idea \quit_if is leaved? It is pity - it is common use case -
and because we cannot to implement macros in psql, then can be very useful


>
> Are we good with:
> - implementing basic \if EXPR \elseif EXPR \else \endif, where the EXPR is
> an expression but is currently limited to a simple string that will be
> evaluated for truth via ParseVariableBool()?
>

+1


> - moving beyond trivial expressions in a later commit?
>

the expressions are in nice to have category - there can be a logic

 if there is only a variable, check the variable; else eval on server and
check the result.


> - leaving loops out for now?
>

+1

Regards

Pavel


Re: [HACKERS] [COMMITTERS] pgsql: Fix a bug in how we generate partition constraints.

2017-01-13 Thread Amit Langote
On Sat, Jan 14, 2017 at 6:10 AM, Robert Haas  wrote:
> On Fri, Jan 13, 2017 at 3:09 PM, Alvaro Herrera  
> wrote:
>>
>> I'm just saying that the problem at hand is already solved for a related
>> feature, so ISTM this new code should use the recently added routine
>> rather than doing the same thing in a different way.
>
> Oh, I see.  Amit, thoughts?

Hm, perhaps.  The code in map_partition_varattnos() that creates the
map could be replaced by a call to the new
convert_tuples_by_name_map().  In fact, it could even have used the
old version of it (convert_tuples_by_name()).  I guess I just aped
what other callers of map_variable_attnos() were doing, which is to
generate the map themselves (not that they ought to be changed to use
convert_tuples_by_name_map).

I will send a patch at my earliest convenience. Thanks to Alvaro for
pointing that out.

Thanks,
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-01-13 Thread Tomas Vondra

On 01/14/2017 12:06 AM, Andres Freund wrote:

Hi,


On 2017-01-13 17:58:41 -0500, Tom Lane wrote:

But, again, the catcache isn't the only source of per-process bloat
and I'm not even sure it's the main one.  A more holistic approach
might be called for.


It'd be helpful if we'd find a way to make it easy to get statistics
about the size of various caches in production systems. Right now
that's kinda hard, resulting in us having to make a lot of
guesses...



What about a simple C extension, that could inspect those caches? 
Assuming it could be loaded into a single backend, that should be 
relatively acceptable way (compared to loading it to all backends using 
shared_preload_libraries).



--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Peter Geoghegan
On Fri, Jan 13, 2017 at 3:44 PM, Serge Rielau  wrote:
> And sometimes the community DOES go its own way rather than implementing the 
> standard. For example by rejecting the MERGE statement in favor of another 
> syntax and semantic.

That's total nonsense.

MERGE isn't UPSERT, and isn't even in competition with UPSERT as a
feature. I've written reams of text explaining why this is so in
precise detail, with reference to the implementations of all other
major systems [1][2]. The general consensus is that we might one day
have both UPSERT and MERGE, just like Teradata. They really are that
different that that would be perfectly reasonable. Any application
that uses MERGE and assumes UPSERT-like guarantees should be assumed
broken. We didn't diverge from the SQL standard on a whim. This was
discussed, on and off, for over a year.

[1] https://wiki.postgresql.org/wiki/UPSERT#MERGE_disadvantages
[2] 
https://www.postgresql.org/message-id/CAM3SWZRP0c3g6+aJ=yydgyactzg0xa8-1_fcvo5xm7hrel3...@mail.gmail.com
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Serge Rielau
On Fri, Jan 13, 2017 at 2:46 PM, Kevin Grittner  wrote:
On Fri, Jan 13, 2017 at 12:35 PM, Serge Rielau  wrote:

> Yes my proposal to nest schemata is “radical” and this community
> is not falling into that camp.
> But there is nothing holy about database.schema.object.attribute

It is mandated by the U.S. and international SQL standard documents. Compliance 
to the standard does not prohibit extensions to the standard. That is, in fact, 
how the standard is progressed.
The SQL/PSM standard introduced another “dot”: database.schema.module.object
And sometimes the community DOES go its own way rather than implementing the 
standard. For example by rejecting the MERGE statement in favor of another 
syntax and semantic.
Cheers Serge

Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2017-01-13 Thread Corey Huinker
Ok, so activity on this thread has died down. I'm not sure if that's
consensus or exhaustion.

Are we good with:
- implementing basic \if EXPR \elseif EXPR \else \endif, where the EXPR is
an expression but is currently limited to a simple string that will be
evaluated for truth via ParseVariableBool()?
- moving beyond trivial expressions in a later commit?
- leaving loops out for now?


Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-01-13 Thread Andres Freund
Hi,


On 2017-01-13 17:58:41 -0500, Tom Lane wrote:
> But, again, the catcache isn't the only source of per-process bloat
> and I'm not even sure it's the main one.  A more holistic approach
> might be called for.

It'd be helpful if we'd find a way to make it easy to get statistics
about the size of various caches in production systems. Right now that's
kinda hard, resulting in us having to make a lot of guesses...

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Off-by-one oddity in minval for decreasing sequences

2017-01-13 Thread Michael Paquier
On Sat, Jan 7, 2017 at 4:15 AM, Daniel Verite  wrote:
> The defaults comes from these definitions, in include/pg_config_manual.h
>
> /*
>  * Set the upper and lower bounds of sequence values.
>  */
> #define SEQ_MAXVALUEPG_INT64_MAX
> #define SEQ_MINVALUE(-SEQ_MAXVALUE)
>
> with no comment as to why SEQ_MINVALUE is not PG_INT64_MIN.

Bruce likely does not remember 8000fdd4 from 2003. I stopped tracking there.

> When using other types than bigint, Peter's patch fixes the inconsistency
> but also makes it worse by ISTM applying the rule that the lowest value
> is forbidden for int2 and int4 in addition to int8.

I think that mapping a sequence to a data type gives a good argument
to have the range mapping with the data type involved. That's less
surprise for the users, and less code to think about range boundaries.
Also, restoring from a dump sequences won't result in an error for
sequence definitions using PG_INT**_MIN as minimum value.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-01-13 Thread Tom Lane
Michael Paquier  writes:
> ... There are even some apps that do not use pgbouncer, but drop
> sessions after a timeout of inactivity to avoid a memory bloat because
> of the problem of this thread.

Yeah, a certain company I used to work for had to do that, though their
problem had more to do with bloat in plpgsql's compiled-functions cache
(and ensuing bloat in the plancache), I believe.

Still, I'm pretty suspicious of anything that will add overhead to
catcache lookups.  If you think the performance of those is not absolutely
critical, turning off the caches via -DCLOBBER_CACHE_ALWAYS will soon
disabuse you of the error.

I'm inclined to think that a more profitable direction to look in is
finding a way to limit the cache size.  I know we got rid of exactly that
years ago, but the problems with it were (a) the mechanism was itself
pretty expensive --- a global-to-all-caches LRU list IIRC, and (b) there
wasn't a way to tune the limit.  Possibly somebody can think of some
cheaper, perhaps less precise way of aging out old entries.  As for
(b), this is the sort of problem we made GUCs for.

But, again, the catcache isn't the only source of per-process bloat
and I'm not even sure it's the main one.  A more holistic approach
might be called for.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Kevin Grittner
On Fri, Jan 13, 2017 at 12:35 PM, Serge Rielau  wrote:

> Yes my proposal to nest schemata is “radical” and this community
> is not falling into that camp.
> But there is nothing holy about database.schema.object.attribute

It is mandated by the U.S. and international SQL standard documents.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-13 Thread Michael Paquier
On Sat, Jan 14, 2017 at 1:43 AM, Kevin Grittner  wrote:
> On Fri, Jan 13, 2017 at 7:09 AM, Michael Paquier
>  wrote:
>> There is no real harm in including current_logfiles in base
>> backups, but that's really in the same bag as postmaster.opts or
>> postmaster.pid, particularly if the log file name has a
>> timestamp.
>
> I'm going to dispute that -- if postmaster.opts and postmaster.pid
> are present when you restore, it takes away a level of insurance
> against restoring a corrupted image of the database without knowing
> it.  In particular, if the backup_label file is deleted (which
> happens with alarmingly frequency), the startup code may think it
> is dealing with a cluster that crashed rather than with a restore
> of a backup.  This often leads to corruption (anything from
> "database can't start" to subtle index corruption that isn't
> noticed for months).  The presence of log files from the time of
> the backup do not present a similar hazard.
>
> So while I agree that there is no harm in including
> current_logfiles in base backups, I object to the comparisons to
> the more dangerous files.

Good point.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-01-13 Thread Michael Paquier
On Sat, Jan 14, 2017 at 12:32 AM, Robert Haas  wrote:
> On Fri, Jan 13, 2017 at 8:58 AM, Tom Lane  wrote:
>> Michael Paquier  writes:
>>> Have there been ever discussions about having catcache entries in a
>>> shared memory area? This does not sound much performance-wise, I am
>>> just wondering about the concept and I cannot find references to such
>>> discussions.
>>
>> I'm sure it's been discussed.  Offhand I remember the following issues:
>>
>> * A shared cache would create locking and contention overhead.
>>
>> * A shared cache would have a very hard size limit, at least if it's
>> in SysV-style shared memory (perhaps DSM would let us relax that).
>>
>> * Transactions that are doing DDL have a requirement for the catcache
>> to reflect changes that they've made locally but not yet committed,
>> so said changes mustn't be visible globally.
>>
>> You could possibly get around the third point with a local catcache that's
>> searched before the shared one, but tuning that to be performant sounds
>> like a mess.  Also, I'm not sure how such a structure could cope with
>> uncommitted deletions: delete A -> remove A from local catcache, but not
>> the shared one -> search for A in local catcache -> not found -> search
>> for A in shared catcache -> found -> oops.
>
> I think the first of those concerns is the key one.  If searching the
> system catalogs costs $100 and searching the private catcache costs
> $1, what's the cost of searching a hypothetical shared catcache?  If
> the answer is $80, it's not worth doing.  If the answer is $5, it's
> probably still not worth doing.  If the answer is $1.25, then it's
> probably worth investing some energy into trying to solve the other
> problems you list.  For some users, the memory cost of catcache and
> syscache entries multiplied by N backends are a very serious problem,
> so it would be nice to have some other options.  But we do so many
> syscache lookups that a shared cache won't be viable unless it's
> almost as fast as a backend-private cache, or at least that's my
> hunch.

Being able to switch from one mode to another would be interesting.
Applications using extensing DDLs that require to change the catcache
with an exclusive lock would clearly pay the lock contention cost, but
do you think that be really the case of a shared lock? A bunch of
applications that I work with deploy Postgres once, then don't change
the schema except when an upgrade happens. So that would be benefitial
for that. There are even some apps that do not use pgbouncer, but drop
sessions after a timeout of inactivity to avoid a memory bloat because
of the problem of this thread. That won't solve the problem of the
local catcache bloat, but some users using few DDLs may be fine to pay
some extra concurrency cost if the session handling gets easied.

> I think it would be interested for somebody to build a prototype here
> that ignores all the problems but the first and uses some
> straightforward, relatively unoptimized locking strategy for the first
> problem.  Then benchmark it.  If the results show that the idea has
> legs, then we can try to figure out what a real implementation would
> look like.
> (One possible approach: use Thomas Munro's DHT stuff to build the shared 
> cache.)

Yeah, I'd bet on a couple of days of focus to sort that out.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Fabien COELHO



Here's a v2 that does it like that.


Applies, compiles, overall & pgss make check are both ok.

Personnally I find having RawStmt only for the top parsed statement 
cleaner, and it significantly reduces changes in the parser, as expected, 
without inducing too much changes elsewhere.


Same comment as on v1: I'm wondering about the actual coverage of the 
default tests... I cannot say I understand everything.


It ends up being about 30 fewer lines of code overall, despite there 
being four places that have to make ad-hoc RawStmt nodes.  On the whole 
I feel like this is cleaner,


I agree: Better typing, more homogeneous code (PlannedStmt for all), 
less ad-hoc checks to work around utility statements...


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017

2017-01-13 Thread Alvaro Herrera
Jim Nasby wrote:
> On 1/10/17 1:53 AM, Alexander Korotkov wrote:
> > 1. What project ideas we have?
> 
> Perhaps allowing SQL-only extensions without requiring filesystem files
> would be a good project.

Don't we already have that in patch form?  Dimitri submitted it as I
recall.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] how to correctly invalidate a constraint?

2017-01-13 Thread Alvaro Herrera
Pavel Stehule wrote:
> Hi
> 
> I would to do import without RI check - so I disable RI triggers.
> 
> I would to invalidate constraint on Foreign Keys and then I would to use
> ALTER TABLE VALIDATE CONSTRAINT ...
> 
> I didn't find how to invalidate constraint without direct update
> pg_constraint.
> 
> Is there some clean way?

I think what you want is:
- set the constraint as "not valid", so that the following is a valid
  operation
- set the RI trigger not to fire, to improve performance of bulk loads
- do the load
- activate the trigger
- validate the constraint

We have SQL commands for everything except the first step.  Now my
question would be: do we want to support that operation as a stand-alone
thing so that you can construct the above from pieces, or do we want
some higher-level command so that the above is less cumbersome?  The
main issue I see is that a single constraint involves several triggers,
and the triggers have internally-derived, very ugly names.  So in my
mind the right way to provide this functionality is to have a command
that operates on the RI constraint and modifies the triggers status.

ALTER TABLE .. ALTER CONSTRAINT [name / ALL] DEACTIVATE
   -- sets constraint as NOT VALID, also sets triggers inactive

[user bulkload occurs here]

ALTER TABLE .. ALTER CONSTRAINT [name / ALL] ACTIVATE
   -- activates triggers, validates constraint

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw bug in 9.6

2017-01-13 Thread Jeff Janes
On Fri, Jan 13, 2017 at 3:22 AM, Etsuro Fujita 
wrote:

> On 2017/01/13 0:43, Robert Haas wrote:
>
>> On Wed, Jan 11, 2017 at 2:45 AM, Etsuro Fujita
>>  wrote:
>>
>>> As I said before, that might be fine for 9.6, but I don't think it's a
>>> good
>>> idea to search the pathlist because once we support parameterized foreign
>>> join paths, which is on my TODOs, we would have to traverse through the
>>> possibly-lengthy pathlist to find a local-join path, as mentioned in [3].
>>>
>>
> I'm not sure that's really going to be a problem.  The number of
>> possible parameterizations that need to be considered isn't usually
>> very big.  I bet the path list will have ten or a few tens of entries
>> in it, not a hundred or a thousand.  Traversing it isn't that
>> expensive.
>>
>> That having been said, I haven't read the patches, so I'm not really
>> up to speed on the bigger issues here.  But surely it's more important
>> to get the overall design right than to worry about the cost of
>> walking the pathlist or worrying about the cost of an extra function
>> call (?!).
>>
>
> My biggest concern about GetExistingLocalJoinPath is that might not be
> extendable to the case of foreign-join paths with parameterization; in
> which case, fdw_outerpath for a given foreign-join path would need to have
> the same parameterization as the foreign-join path, but there might not be
> any existing paths with the same parameterization in the path list.  You
> might think we could get the fdw_outerpath by getting an existing path with
> no parameterization as in GetExistingLocalJoinPath and then modifying the
> path's param_info to match the parameterization of the foreign-join path.
> I don't know that really works, but that might be inefficient.
>
> What I have in mind to support foreign-join paths with parameterization
> for postgres_fdw like this: (1) generate parameterized paths from any
> joinable combination of the outer/inner cheapest-parameterized paths that
> have pushed down the outer/inner relation to the remote server, in a
> similar way as postgresGetForeignJoinPaths creates unparameterized paths,
> and (2) create fdw_outerpath for each parameterized path from the
> outer/inner paths used to generate the parameterized path, by
> create_nestloop_path (or, create_hashjoin_path or create_mergejoin_path if
> full join), so that the resulting fdw_outerpath has the same
> parameterization as the paramterized path.  This would probably work and
> might be more efficient.  And the patch I proposed would be easily extended
> to this, by replacing the outer/inner cheapest-total paths with the
> outer/inner cheapest-parameterized paths.  Attached is the latest version
> of the patch.
>


I'm afraid this discussion and the C code here are over my head.  But I've
tested this patch against 9.6.stable-2d443ae1b0121e15265864d2b21 and
10.dev-0563a3a8b59150bf3cc8, and in both cases it fixes the original
problem.

I do get a compiler warning:

foreign.c: In function 'CreateLocalJoinPath':
foreign.c:832: warning: implicit declaration of function
'pathkeys_contained_in'


Cheers,

Jeff


Re: [HACKERS] Packages: Again

2017-01-13 Thread Pavel Stehule
2017-01-13 22:16 GMT+01:00 Serge Rielau :

> On Fri, Jan 13, 2017 at 12:45 PM, Pavel Stehule 
> wrote:
>
> show patch and show a advantages against schema, please.
>
> I have tried to describe the advantage.
> If the community doesn’t agree, that’s fine.
> I do not see how expending the effort of back porting a patch (and getting
> clearance for that from my employer) will enhance my argument.
>

This theme is opened, and any discussion is welcome - it is not simple
task, to design any solution that be natural in PostgreSQL environment

Regards

Pavel


>
> Cheers
> Serge
>
>
>


Re: [HACKERS] Declarative partitioning - another take

2017-01-13 Thread Robert Haas
On Tue, Jan 10, 2017 at 6:06 AM, Amit Langote
 wrote:
> On 2017/01/05 5:50, Robert Haas wrote:
>> On Tue, Dec 27, 2016 at 3:59 AM, Amit Langote
>>  wrote:
>>> Patches 0001 to 0006 unchanged.
>>
>> Committed 0001 earlier, as mentioned in a separate email.  Committed
>> 0002 and part of 0003.
>
> Thanks!  I realized however that the approach I used in 0002 of passing
> the original slot to ExecConstraints() fails in certain situations.  For
> example, if a BR trigger changes the tuple, the original slot would not
> receive those changes, so it will be wrong to use such a tuple anymore.
> In attached 0001, I switched back to the approach of converting the
> partition-tupdesc-based tuple back to the root partitioned table's format.
>  The converted tuple contains the changes by BR triggers, if any.  Sorry
> about some unnecessary work.

Hmm.  Even with this patch, I wonder if this is really correct.  I
mean, isn't the root of the problem here that ExecConstraints() is
expecting that resultRelInfo matches slot, and it doesn't? And why
isn't that also a problem for the things that get passed resultRelInfo
and slot after tuple routing and before ExecConstraints?  In
particular, in copy.c, ExecBRInsertTriggers.

Committed 0002.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade vs. pg_ctl stop -m fast

2017-01-13 Thread Peter Eisentraut
On 1/13/17 9:45 AM, Peter Eisentraut wrote:
> On 1/12/17 9:43 PM, Bruce Momjian wrote:
>> On Thu, Jan 12, 2017 at 10:17:52AM -0500, Tom Lane wrote:
>>> Peter Eisentraut  writes:
 In 9.5, the default pg_ctl stop mode was changed from "smart" to "fast".
 In pg_upgrade, there is this code:
 ...
 I think the last line should be changed to something like
   fast ? "-m fast" : "-m smart");
>>>
>>> Ugh.  Clear oversight.
>>>
>>> There is maybe room for a separate discussion about whether pg_upgrade
>>> *should* be using fast mode, but if so we could remove the "bool fast"
>>> argument from this function altogether.
>>
>> Agreed, it should be remove.  Should I do it?
> 
> For 9.5 and 9.6, I think we should backpatch what I suggested above, to
> minimize the behavior change.

I have committed that (including to master).

> For master we can consider removing the
> distinction and just use fast shutdown all the time, but I haven't
> checked all the possible implications of that change.

I'm not planning to work on this at this time.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Serge Rielau
On Fri, Jan 13, 2017 at 12:45 PM, Pavel Stehule  
wrote: show patch and show a advantages against schema, please. I have tried to 
describe the advantage. If the community doesn’t agree, that’s fine. I do not 
see how expending the effort of back porting a patch (and getting clearance for 
that from my employer) will enhance my argument.
Cheers Serge

Re: [HACKERS] [COMMITTERS] pgsql: Fix a bug in how we generate partition constraints.

2017-01-13 Thread Robert Haas
On Fri, Jan 13, 2017 at 3:09 PM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> On Fri, Jan 13, 2017 at 2:31 PM, Robert Haas  wrote:
>> > On Fri, Jan 13, 2017 at 2:18 PM, Alvaro Herrera
>> >  wrote:
>
>> >> Hmm, we were discussing this stuff a few days ago, see
>> >> https://www.postgresql.org/message-id/20170109182800.qrkae62kmur3gfeg@alvherre.pgsql
>> >> and commit 3957b58b8885441c8d03bc1cfc00e47cf8cd7975.  Part of this code
>> >> duplicates that ...
>> >
>> > Is that bad?
>>
>> If you are expressing a concern about who wrote this code, I took
>> Amit's word for it that he did.
>
> I'm just saying that the problem at hand is already solved for a related
> feature, so ISTM this new code should use the recently added routine
> rather than doing the same thing in a different way.

Oh, I see.  Amit, thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017

2017-01-13 Thread Peter van Hardenberg
A new data type, and/or a new index type could both be nicely scoped bits
of work.

On Thu, Jan 12, 2017 at 12:27 PM, Pavel Stehule 
wrote:

>
>
> 2017-01-12 21:21 GMT+01:00 Jim Nasby :
>
>> On 1/10/17 1:53 AM, Alexander Korotkov wrote:
>>
>>> 1. What project ideas we have?
>>>
>>
>> Perhaps allowing SQL-only extensions without requiring filesystem files
>> would be a good project.
>>
>
> Implementation safe evaluation untrusted PL functions - evaluation under
> different user under different process.
>
> Regards
>
> Pavel
>
>
>
>> --
>> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
>> Experts in Analytics, Data Architecture and PostgreSQL
>> Data in Trouble? Get it in Treble! http://BlueTreble.com
>> 855-TREBLE2 (855-873-2532)
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>
>


-- 
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut


Re: [HACKERS] gettimeofday is at the end of its usefulness?

2017-01-13 Thread Tom Lane
Florian Weimer  writes:
> glibc has some test cases which fail because clock_gettime gives
> inconsistent results.  This has been fixed in current kernels, but I
> don't know if everyone uses them.

hmm, for which clock IDs?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gettimeofday is at the end of its usefulness?

2017-01-13 Thread Florian Weimer
* Tom Lane:

> Florian Weimer  writes:
>> * Tom Lane:
>>> On Linux (RHEL6, 2.4GHz x86_64), I find that gettimeofday(),
>>> clock_gettime(CLOCK_MONOTONIC), and clock_gettime(CLOCK_REALTIME)
>>> all take about 40ns.  Of course gettimeofday() only has 1us resolution,
>>> but the other two have perhaps 10ns resolution (I get no duplicate
>>> readings in a tight loop).
>
>> Isn't this very specific to kernel and glibc versions, depending on
>> things like CONFIG_HZ settings and what level of vDSO support has been
>> backported?
>
> No doubt, but I have yet to find a platform where clock_gettime() exists
> but performs worse than gettimeofday().  Do you know of one?

ppc64le with all the vDSO fixes for clock_gettime?

glibc has some test cases which fail because clock_gettime gives
inconsistent results.  This has been fixed in current kernels, but I
don't know if everyone uses them.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Pavel Stehule
2017-01-13 20:38 GMT+01:00 Serge Rielau :

>
> > On Jan 13, 2017, at 11:11 AM, Pavel Stehule 
> wrote:
> >
> > With Postgres we should to think much more about other PL - there is not
> only PL/pgSQL. So any what we create should be available for any PL. Our
> PLpgSQL is based on total different technology design - so some benefits of
> sharing compiled code across databases has not too value in Postgres.
> Let me stress one last point:
> MODULE’s are 100% orthogonal to PLpgSQL as implement by SFDC and also
> orthogonal to SQL PL as implemented by DB2.
> Modules can (and do for us) contain C-functions of example.
> Similarly when the community provides provides server side session
> variables I have no doubt they will integrate with MODULE’s with very
> little work.
>
> It’s a DDL and name resolution game, predominantly
>

show patch and show a advantages against schema, please.

Regards

Pavel


> Cheers
> Serge


[HACKERS] Re: [COMMITTERS] pgsql: Fix a bug in how we generate partition constraints.

2017-01-13 Thread Alvaro Herrera
Robert Haas wrote:
> On Fri, Jan 13, 2017 at 2:31 PM, Robert Haas  wrote:
> > On Fri, Jan 13, 2017 at 2:18 PM, Alvaro Herrera
> >  wrote:

> >> Hmm, we were discussing this stuff a few days ago, see
> >> https://www.postgresql.org/message-id/20170109182800.qrkae62kmur3gfeg@alvherre.pgsql
> >> and commit 3957b58b8885441c8d03bc1cfc00e47cf8cd7975.  Part of this code
> >> duplicates that ...
> >
> > Is that bad?
> 
> If you are expressing a concern about who wrote this code, I took
> Amit's word for it that he did.

I'm just saying that the problem at hand is already solved for a related
feature, so ISTM this new code should use the recently added routine
rather than doing the same thing in a different way.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Serge Rielau

> On Jan 13, 2017, at 11:11 AM, Pavel Stehule  wrote:
> 
> With Postgres we should to think much more about other PL - there is not only 
> PL/pgSQL. So any what we create should be available for any PL. Our PLpgSQL 
> is based on total different technology design - so some benefits of sharing 
> compiled code across databases has not too value in Postgres.
Let me stress one last point:
MODULE’s are 100% orthogonal to PLpgSQL as implement by SFDC and also 
orthogonal to SQL PL as implemented by DB2.
Modules can (and do for us) contain C-functions of example.
Similarly when the community provides provides server side session variables I 
have no doubt they will integrate with MODULE’s with very little work.

It’s a DDL and name resolution game, predominantly

Cheers
Serge

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix a bug in how we generate partition constraints.

2017-01-13 Thread Robert Haas
On Fri, Jan 13, 2017 at 2:31 PM, Robert Haas  wrote:
> On Fri, Jan 13, 2017 at 2:18 PM, Alvaro Herrera
>  wrote:
>> Robert Haas wrote:
>>> Fix a bug in how we generate partition constraints.
>>>
>>> Move the code for doing parent attnos to child attnos mapping for Vars
>>> in partition constraint expressions to a separate function
>>> map_partition_varattnos() and call it from the appropriate places.
>>
>> Hmm, we were discussing this stuff a few days ago, see
>> https://www.postgresql.org/message-id/20170109182800.qrkae62kmur3gfeg@alvherre.pgsql
>> and commit 3957b58b8885441c8d03bc1cfc00e47cf8cd7975.  Part of this code
>> duplicates that ...
>
> Is that bad?

If you are expressing a concern about who wrote this code, I took
Amit's word for it that he did.  His patch file says:

Reported by: n/a
Patch by: Amit Langote
Reports: n/a

If that ain't right, that's bad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix a bug in how we generate partition constraints.

2017-01-13 Thread Robert Haas
On Fri, Jan 13, 2017 at 2:18 PM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> Fix a bug in how we generate partition constraints.
>>
>> Move the code for doing parent attnos to child attnos mapping for Vars
>> in partition constraint expressions to a separate function
>> map_partition_varattnos() and call it from the appropriate places.
>
> Hmm, we were discussing this stuff a few days ago, see
> https://www.postgresql.org/message-id/20170109182800.qrkae62kmur3gfeg@alvherre.pgsql
> and commit 3957b58b8885441c8d03bc1cfc00e47cf8cd7975.  Part of this code
> duplicates that ...

Is that bad?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Fix a bug in how we generate partition constraints.

2017-01-13 Thread Alvaro Herrera
Robert Haas wrote:
> Fix a bug in how we generate partition constraints.
> 
> Move the code for doing parent attnos to child attnos mapping for Vars
> in partition constraint expressions to a separate function
> map_partition_varattnos() and call it from the appropriate places.

Hmm, we were discussing this stuff a few days ago, see
https://www.postgresql.org/message-id/20170109182800.qrkae62kmur3gfeg@alvherre.pgsql
and commit 3957b58b8885441c8d03bc1cfc00e47cf8cd7975.  Part of this code
duplicates that ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-13 Thread Alvaro Herrera
Robert Haas wrote:

> From a technical point of view, that is pretty much true.  But from a
> user perspective, I don't think it is.  If the old names live in an
> extension, then they will not be there by default.  Tools and scripts
> will break.  If that's a problem for a particular user, they an
> install the extension to unbreak those things, but tool authors will
> probably be motivated to upgrade their tools so that the extension is
> no longer needed.  Therefore, if we eventually drop the extension,
> there will probably be few users relying on it at that point.  On the
> other hand, if we install aliases that are present in every install -
> and that are not even optionally removable - a lot of people are not
> going to bother using the new names at all.  They'll just keep on
> using the old ones so that their tools work with both old and new
> versions, and if we eventually rip the stuff out of pg_proc.h it will
> cause almost as much howling as if we'd done it right at the
> beginning.
> 
> In other words, adding an extension is the only option that I see
> which will give tool authors a strong incentive to support the new
> names without making it a hard requirement on day one.

A compatibility extension sounds like a pretty decent idea to me,
considering this point.  I think breakage without a workaround will
cause people to defer migration more than they otherwise would; the
extension gives them an easy way forward while they set up their
internal code to the new reality, to test the other new things in pg10.

If we keep the extension forever, so what?  It's not like tsearch2
causes us too much of a maintenance pain.  It's been patched a few
times, but only to change things that wouldn't need changes in a
xlog->wal aliases-only module.  Anyway, I would mark the module as
deprecated with an intent of removing it in half a dozen releases or so.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Pavel Stehule
2017-01-13 19:35 GMT+01:00 Serge Rielau :

>
> > On Jan 13, 2017, at 10:23 AM, Pavel Stehule 
> wrote:
> >
> > I have not clean feeling from this - I am pretty sure so I am afraid
> schizophrenic  between MODULES, SCHEMAS. Nested schemas increase complexity
> of searching complexity and breaks a logic database.schema.object
> Yes my proposal to nest schemata is “radical” and this community is not
> falling into that camp.
> But there is nothing holy about database.schema.object.attribute
>

sure not - but lot of logic in SQL parser and PLpgSQL parser is based on it.


> .
> >
> > Currently almost all in PostgreSQL PL design is primitive, but that
> means pretty simple too.
> We are having > 30,000 functions with a total of millions of lines of code.
>

I understand so your working life is pretty hard :).


>
> You are describing a self fulfilling prophecy here.
> As long as the community codebase only caters to its existing users you
> will not see a change in the usage pattern of the base.
>
> > It is hard to see a advantages of this proposal.
>
> At least I tried :-)
>

I would be careful to translate a Oracle concept to PostgreSQL - The
packages is Ada language concept - and there has clean role. In PL/SQL
Oracle used packages like we used schema because Oracle has different
concept of terms "database", of term "schema".  The packages in Postgres is
more redundant than in Oracle. More the packages and related features are
probably most difficult PL/SQL feature. Lot of people don't understand well
- and it is not surprise for me, because PL/SQL is really hard mix of two
very different worlds.

I agree so there is some gap - there is nothing like package variables,
package constants.  Can be nice to fill it.

With Postgres we should to think much more about other PL - there is not
only PL/pgSQL. So any what we create should be available for any PL. Our
PLpgSQL is based on total different technology design - so some benefits of
sharing compiled code across databases has not too value in Postgres.

Maybe I am starting be old :) - I don't believe so stronger tools helps do
things better - like Java - some applications are pretty good, some are the
big heap of shit - and due strong language this heap is sometimes pretty
big :)

If you need nested schemas, maybe you do some scary things, that should be
better solved in application server.

So I am not 100% against, but really I am sceptic if it is good idea. We
don't design Postgres as platform for migration legacy Oracle code - on
second hand we should not to create artificial breaks against this
migrations.

Regards

Pavel





> Serge
>
>
>
>


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Tom Lane
Fabien COELHO  writes:
>> One thing that I'm not quite satisfied with is the business with
>> non-top-level RawStmt nodes in some utility statements. 
>> ...
>> So I'm now thinking that it might be better if the grammar produced
>> RawStmt only at top level, and anybody who calls pg_analyze_and_rewrite
>> on sub-sections of a utility statement has to cons up a RawStmt to put
>> at the top of the sub-query.

> Why not. The lazy programmer I am notices that there seems to be 6 
> instances, this is not too bad, some of which are already dealt with. The 
> RawStmt may not need to be allocated dynamically, a stack instance could 
> be enough.

Here's a v2 that does it like that.  It ends up being about 30 fewer lines
of code overall, despite there being four places that have to make ad-hoc
RawStmt nodes.  On the whole I feel like this is cleaner, although there's
room to argue that.  Still, the extra cruft is in places that I'm
suspicious are wrong anyway, as I noted.

regards, tom lane



stmt-list-rewrite-2.patch.gz
Description: stmt-list-rewrite-2.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plan_rows confusion with parallel queries

2017-01-13 Thread Robert Haas
On Wed, Jan 11, 2017 at 4:05 PM, Robert Haas  wrote:
>> While investigating why Rushabh Lathia's Gather Merge patch sometimes
>> fails to pick a Gather Merge plan even when it really ought to do so,
>> I ran smack into this problem.  I discovered that this is more than a
>> cosmetic issue.  The costing itself is actually badly broken.
>>
>> The reason why this is happening is that final_cost_nestloop(),
>> final_cost_hashjoin(), and final_cost_mergejoin() don't care a whit
>> about whether the path they are generating is partial.  They apply the
>> row estimate for the joinrel itself to every such path generated for
>> the join, except for parameterized paths which are a special case.  I
>> think this generally has the effect of discouraging parallel joins,
>> because the inflated row count also inflates the join cost.  I think
>> the right thing to do is probably to scale the row count estimate for
>> the joinrel's partial paths by the leader_contribution value computed
>> in cost_seqscan.
>>
>> Despite my general hatred of back-patching things that cause plan
>> changes, I'm inclined to think the fix for this should be back-patched
>> to 9.6, because this is really a brown-paper-bag bug.  If the
>> consensus is otherwise I will of course defer to that consensus.
>
> And here is a patch which seems to fix the problem.

Since nobody seems to have any comment here, I've committed and
back-patched this to 9.6.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Serge Rielau

> On Jan 13, 2017, at 10:23 AM, Pavel Stehule  wrote:
> 
> I have not clean feeling from this - I am pretty sure so I am afraid 
> schizophrenic  between MODULES, SCHEMAS. Nested schemas increase complexity 
> of searching complexity and breaks a logic database.schema.object
Yes my proposal to nest schemata is “radical” and this community is not falling 
into that camp.
But there is nothing holy about database.schema.object.attribute
.
> 
> Currently almost all in PostgreSQL PL design is primitive, but that means 
> pretty simple too.
We are having > 30,000 functions with a total of millions of lines of code.

You are describing a self fulfilling prophecy here.
As long as the community codebase only caters to its existing users you will 
not see a change in the usage pattern of the base.

> It is hard to see a advantages of this proposal. 

At least I tried :-)

Serge





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Pavel Stehule
2017-01-13 18:35 GMT+01:00 Serge Rielau :

>
>> * A design that can fit in with PostgreSQL
>> * Solid benefits beyond "makes life easier for Oracle users" to
>> justify each feature/change
>> * Funding/time to make it happen
>>
>> So far, I haven't seen anyone with one of those, let alone all three.
>>
> OK, I’ll bite…
>
> * In SFDC’s extension of PostgreSQL we nest namespaces.
>   This was done before my time here, but its very stable. It's easy to
> keep merged and not that much code.
>   To make the special semantics of these nested namespaces evident however
> we leaned on the SQL/PSM standard and call them MODULE’s.
>   Unlike the standard our MODULEs share the namespace (no pun intended)
> with regular schemata which seems practical and limits confusion when
> referencing
>   a module without schema qualification.
>
>   We did extend upon the standard with ALTER MODULE .. ADD [FUNCTION |
> TYPE | …] syntax.
>   Just like few users create a new schema with tables in one statement,
> no-one actually creates a module with content in one statement (unless, as
> in Oracle they have to).
>   This was done before my time as well, but parallels what we implemented
> did in DB2 for the reasons described earlier in this thread.
>   You want to be able to modify members of a module separately.
>
>   Starting with a blank slate I do wonder whether simply allowing nesting
> of namespaces would be sufficient to achieve the vast majority of the goal.
>   I.e. CREATE SCHEMA .
>   The rest… follows trivially :-)
>
> * Benefits:
>   a) The files on my computer are organized in directories that have more
> than one level of nesting.
>I simply can’t imagine having thousands or tens of thousands of
> objects lying around and only one coarse way of subdividing them.
>   This is compounded by the desire you version. I want to the same
> names for objects across multiple concurrently present versions of the
> schema.
>If I consume the schema for the version the entire schema for a
> version becomes a flat jumple.
>   b) Access control
>   By putting things that belong together actually together in an
> explicit way I can achieve scoping without having to resort to permissions.
>   I can simply postulate that all objects in a module are private
> unless they are published.
>   Access control happens at the module level.
>  This is no different than library management on your OS.
>  You don’t chmod the individual entry points!
>  c) Scoping
>  Similar to the above, but more related to search path.
>  Within a module I can be assured that any unqualified references will
> first resolve within the module.
>  No mucking with the search path by anyone will cause me to execute
> the wrong function, resolve to the wrong type etc.
>
>   Simply put: As long as we agree that users want to implement substantial
> server side logic the conclusion that standard programming
>   abstractions such as classes and member functions are a boon seems to be
> obvious.
>
>   Note that I have been careful not to tie modules too strongly to
> specific types. Conceptually I see nothing from with a module, table, view,
> etc.
>   It’s just a bit more “far out” since there is AFAIK no precedence.
>
> * IFF our existing efforts (fast defaults and executor runtime
> improvements) to work with the community are successful I would happily
> lobby
>   to at least port our module code to the community codebase. We can take
> it from there.
>

I have not clean feeling from this - I am pretty sure so I am afraid
schizophrenic  between MODULES, SCHEMAS. Nested schemas increase complexity
of searching complexity and breaks a logic database.schema.object.

Currently almost all in PostgreSQL PL design is primitive, but that means
pretty simple too.

It is hard to see a advantages of this proposal.

Regards

Pavel


> Cheers
> Serge Rielau
> Salesforce.com 
>
>
>
>


Re: [HACKERS] UNDO and in-place update

2017-01-13 Thread Robert Haas
On Fri, Jan 13, 2017 at 5:57 AM, Amit Kapila  wrote:
> Sure, we can do that way and I agree that it is worth considering.
> Few cases where it can be really costly is when undo chain overflows
> to multiple pages and those pages doesn't exist in cache.  I think the
> reason why it is probable is that undo chain is not maintained for row
> update, rather it is maintained for a page level changes for each
> transaction.  So, what this means is that if we have to traverse the
> chain for record X, then I think it is quite possible that during
> chain traversal, we will encounter undo of record Y, ofcourse we can
> identify and ignore it, but certainly the chain can be much longer.
> Another kind of cost could be construction of actual record from undo
> record, for example if we consider to undo log only changed parts of
> Update as we do for WAL, then we have to incur some record
> construction cost as well. I think what can make it really costly is
> repeating this work.  Even, if we consider the above as a worse case,
> I am not sure whether it will be cheap for short transactions like
> pgbench as there could be cases where concurrent updates/reads needs
> to traverse undo chains.
>
> Having said that, if you don't feel strongly about caching the
> information in such a way that it can be reused by different sessions,
> then we can initially implement the system as outlined by you, then we
> can extend it based on the performance characteristics.

Sounds fine.  Also, one idea I've been considering is the idea of
allowing each shared buffer to have some associated "scratch space"
which would act as a cache that lasts as long as that page doesn't get
evicted.  I think that might have some applications for hash indexes
as well.  Of course deciding how much scratch space to allocate and
whether you wouldn't be better off using that memory some other way is
tricky, but it's a thought.

>> I don't see a reason to do that.  I think one of the advantages of
>> storing the UNDO pages in the same pool as shared_buffers is that the
>> amount of UNDO that we store in there can float up and down according
>> to demand.  If the heap pages are being accessed more frequently than
>> the UNDO pages then let the UNDO pages get evicted.  If the reverse is
>> the case, that's OK too.
>
> Okay, not a problem. I was thinking to priortise UNDO buffers, so that
> they can only be evicted during checkpoint.  However, I think there
> might not be much value in doing so.

Seems like we can leave the decision to later.  If that turns out to
be the thing that is hurting performance, we can fix it then.

>>> Vacuum can delete or make the undo file reusable when the
>>> corresponding transaction precedes RecentGlobalXmin.
>>
>> If the associated transaction commits, yes.  If it aborts, then we can
>> recycle it once undo is complete.  But I think this should be the job
>> of a dedicated process which also executes undo actions; I don't think
>> it should be associated with vacuum.
>
> Agreed that it will be good to keep a separate undo process.  However,
> I am not exactly clear about your point of executing undo actions in
> background. Do you mean to say that the main backend will mark
> transaction as aborted in clog, but undo actions will be done in
> backend or you have something else in mind?

I think it's not good to have the main backend perform the UNDO
actions because they might ERROR.  That's not something we can support
when we're already in the abort path.  If it happens in the background
process, it can catch the ERROR and return to the toplevel and retry
after a pause or do whatever needs to be done.  For example, imagine
an I/O error.

> Okay, it seems we can deduce it from trasnction status. If the
> transaction is aborted, then we know undo log is invalid.  If it is
> in-progress, then there will be a valid undo log. If it is committed,
> and all-visble (precedes RecentGlobalXmin), then the undo log will be
> invalid.

For MVCC purposes, that might work, but for other purposes, maybe not.
We actually need to replay the undo log on abort before removing it.

> By create, it seems you are thinking about having undo files as some
> in-memory thing (space reserved in memory) and if required, then only
> create the file?

Yes.

> If so, then the point we have discussed above for
> giving equal priority to undo pages in shared_buffers as heap pages is
> relevant, because backend evicting undo page could be costly as
> compare to heap page.

It's possible, but I don't think that's a key part of the design so
I'd say skip it for now and we'll deal with it if it becomes an issue.

> Won't it be efficient if we can do the same mapping between undo
> byte-offset and undo file as we do for WAL ((WALWrite offset to file).
> We can keep file size bigger than what we have for each WAL Segment?
> Do we need something more?

Well you need something more if you want to combine multiple undo logs
in a single 

Re: [HACKERS] Parallel Index Scans

2017-01-13 Thread Robert Haas
On Fri, Jan 13, 2017 at 9:28 AM, Anastasia Lubennikova
 wrote:
> I didn't find any case of noticeable performance degradation,
> so set status to "Ready for committer".

The very first hunk of doc changes looks like it makes the whitespace
totally wrong - surely it can't be right to have 0-space indents in C
code.

+The index_size parameter indicate the size of generic parallel

indicate -> indicates
size of generic -> size of the generic

+   index-type-specific parallel information which will be stored immediatedly

Typo.

+   Initialize the parallel index scan state.  It will be used to initialize
+   index-type-specific parallel information which will be stored immediatedly
+   after generic parallel information required for parallel index scans.  The
+   required state information will be set in target.
+  
+
+   
+ The aminitparallelscan and
amestimateparallelscan
+ functions need only be provided if the access method supports
parallel
+ index scans.  If it doesn't, the aminitparallelscan and
+ amestimateparallelscan fields in its
IndexAmRoutine
+ struct must be set to NULL.
+   

Inconsistent indentation.   seems like a strange choice of tag.

+/* amestimateparallelscan is optional; assume no-op if not
provided by AM */

The fact that amestimateparallelscan is optional even when parallel
index scans are supported is undocumented.  Similarly for the other
functions, which also seem to be optional but not documented as such.
The code and documentation need to match.

+void   *amtarget = (char *) ((void *) target) + offset;

Passing an unaligned pointer to the AM sounds like a recipe for
crashes on obscure platforms that can't tolerate alignment violations,
and possibly bad performance on others.  I'd arrange MAXALIGN the size
of the generic structure in index_parallelscan_estimate and
index_parallelscan_initialize.  Also, why pass the size of the generic
structure to the AM-specific estimate routine, anyway?  It can't
legally return a smaller value, and we can add_size() just as well as
the AM-specific code.  Wouldn't it make more sense for the AM-specific
code to return the amount of space that is needed for AM-specific
stuff, and let the generic code deal with the generic stuff?

+ *status - True indicates that the block number returned is valid and scan
+ * is continued or block number is invalid and scan has just begun
+ * or block number is P_NONE and scan is finished.  False indicates
+ * that we have reached the end of scan for current
scankeys and for
+ * that we return block number as P_NONE.

It is hard to parse a sentence with that many "and" and "or" clauses,
especially since English, unlike C, does not have strict operator
precedence rules. Perhaps you could reword to make it more clear.
Also, does that survive pgindent with that indentation?

+BTParallelScanDesc btscan = (BTParallelScanDesc) OffsetToPointer(
+(void *) scan->parallel_scan,
+ scan->parallel_scan->ps_offset);

You could avoid these uncomfortable line breaks by declaring the
variable on one line and the initializing it on a separate line.

+SpinLockAcquire(>ps_mutex);
+pageStatus = btscan->ps_pageStatus;
+if (so->arrayKeyCount < btscan->ps_arrayKeyCount)
+*status = false;
+else if (pageStatus == BTPARALLEL_DONE)
+*status = false;
+else if (pageStatus != BTPARALLEL_ADVANCING)
+{
+btscan->ps_pageStatus = BTPARALLEL_ADVANCING;
+nextPage = btscan->ps_nextPage;
+exit_loop = true;
+}
+SpinLockRelease(>ps_mutex);

IMHO, this needs comments.

+ * It updates the value of next page that allows parallel scan to move forward
+ * or backward depending on scan direction.  It wakes up one of the sleeping
+ * workers.

This construction is commonly used in India but sounds awkward to
other English-speakers, or at least to me.  You can either drop the
word "it" and just start with the verb "Updates the value of ..." or
you can replace the first instance of "It" with "This function".
Although actually, I think this whole comment needs rewriting.  Maybe
something like "Save information about scan position and wake up next
worker to continue scan."

+ * This must be called when there are no pages left to scan. Notify end of
+ * parallel scan to all the other workers associated with this scan.

Suggest: When there are no pages left to scan, this function should be
called to notify other workers.  Otherwise, they might wait forever
for the scan to advance to the next page.

+if (status == false)

if (!status) is usually preferred for bools.  (Multiple instances.)

+#define BTPARALLEL_NOT_INITIALIZED 0x01
+#define BTPARALLEL_ADVANCING 0x02
+#define BTPARALLEL_DONE 0x03
+#define BTPARALLEL_IDLE 0x04

Let's 

Re: [HACKERS] Packages: Again

2017-01-13 Thread Serge Rielau
> 
> * A design that can fit in with PostgreSQL
> * Solid benefits beyond "makes life easier for Oracle users" to
> justify each feature/change
> * Funding/time to make it happen
> 
> So far, I haven't seen anyone with one of those, let alone all three.
OK, I’ll bite…

* In SFDC’s extension of PostgreSQL we nest namespaces.
  This was done before my time here, but its very stable. It's easy to keep 
merged and not that much code.
  To make the special semantics of these nested namespaces evident however we 
leaned on the SQL/PSM standard and call them MODULE’s.
  Unlike the standard our MODULEs share the namespace (no pun intended) with 
regular schemata which seems practical and limits confusion when referencing 
  a module without schema qualification.
  
  We did extend upon the standard with ALTER MODULE .. ADD [FUNCTION | TYPE | 
…] syntax.
  Just like few users create a new schema with tables in one statement, no-one 
actually creates a module with content in one statement (unless, as in Oracle 
they have to).
  This was done before my time as well, but parallels what we implemented did 
in DB2 for the reasons described earlier in this thread.
  You want to be able to modify members of a module separately.

  Starting with a blank slate I do wonder whether simply allowing nesting of 
namespaces would be sufficient to achieve the vast majority of the goal.
  I.e. CREATE SCHEMA .
  The rest… follows trivially :-)

* Benefits:
  a) The files on my computer are organized in directories that have more than 
one level of nesting.
   I simply can’t imagine having thousands or tens of thousands of objects 
lying around and only one coarse way of subdividing them.
  This is compounded by the desire you version. I want to the same names 
for objects across multiple concurrently present versions of the schema.
   If I consume the schema for the version the entire schema for a version 
becomes a flat jumple.
  b) Access control
  By putting things that belong together actually together in an explicit 
way I can achieve scoping without having to resort to permissions.
  I can simply postulate that all objects in a module are private unless 
they are published.
  Access control happens at the module level.
 This is no different than library management on your OS.
 You don’t chmod the individual entry points!
 c) Scoping
 Similar to the above, but more related to search path.
 Within a module I can be assured that any unqualified references will 
first resolve within the module.
 No mucking with the search path by anyone will cause me to execute the 
wrong function, resolve to the wrong type etc.  

  Simply put: As long as we agree that users want to implement substantial 
server side logic the conclusion that standard programming 
  abstractions such as classes and member functions are a boon seems to be 
obvious.

  Note that I have been careful not to tie modules too strongly to specific 
types. Conceptually I see nothing from with a module, table, view, etc.
  It’s just a bit more “far out” since there is AFAIK no precedence. 

* IFF our existing efforts (fast defaults and executor runtime improvements) to 
work with the community are successful I would happily lobby 
  to at least port our module code to the community codebase. We can take it 
from there.

Cheers
Serge Rielau
Salesforce.com


 

Re: [HACKERS] Adding type info etc for inheritance errmsg: "child table is missing column ..."

2017-01-13 Thread Alvaro Herrera
Tom Lane wrote:

> AFAICS, what Ryan is after would be better served by a separate tool that
> would produce a "diff" of the two tables' schemas.  Even if we were
> willing to overload this error message with everything we know about the
> parent column, do you really want to fix discrepancies one column at a
> time?  And what about properties that can't be uniquely associated with a
> single column, such as constraints covering multiple columns?
> 
> Also, I have a feeling that a suitable tool is already out there.  A
> moment's digging in the list archives found this thread with links to
> several candidates:
> 
> https://www.postgresql.org/message-id/flat/561D27E7.5010906%40trustport.com
> 
> and I'm pretty sure there have been other such discussions.

I remember looking some time ago, and most of all the possible
candidates were either abandoned or terribly cumbersome to use.  I ran
across Euler Taveira in a conference sometime later and he told me he
had the idea of working on writing such a tool.  I was pleased to see
his announcement recently:
https://www.postgresql.org/message-id/6c9d4a85-55b5-81cc-6f09-8f26a6da2...@timbira.com.br
I admit I have not looked at his code, but he had some very good ideas
for the complex cases.  I think it's worth checking out what he has
done.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Fabien COELHO



One thing that I'm not quite satisfied with is the business with
non-top-level RawStmt nodes in some utility statements. 
a wart from gram.y's perspective, and it's mostly a wart from analyze.c's

perspective as well --- the parse analyze routines mostly just throw away
the non-top-level RawStmt.

The original reason for doing it was that DoCopy needs to call
pg_analyze_and_rewrite() on the sub-statement, and I wanted
pg_analyze_and_rewrite's argument to be declared as RawStmt,


My 0,02€, feel free to ignore them:

Personnaly when I had started doing a version I had decided to only change 
the type at top level, and then I made a few functions being resilient 
about having a RawStmt (that I had called ParsedStmt in my version) or

a direct statement, rather than change the type, I had kept Node*.

Now I see the benefit of changing the type, because the compiler will say 
if there is an issue, and their is no hidden level changes.



So I'm now thinking that it might be better if the grammar produced
RawStmt only at top level, and anybody who calls pg_analyze_and_rewrite
on sub-sections of a utility statement has to cons up a RawStmt to put
at the top of the sub-query.


Why not. The lazy programmer I am notices that there seems to be 6 
instances, this is not too bad, some of which are already dealt with. The 
RawStmt may not need to be allocated dynamically, a stack instance could 
be enough.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-13 Thread Robert Haas
On Thu, Jan 12, 2017 at 1:12 PM, Tom Lane  wrote:
> Vladimir Rusinov  writes:
>> On Thu, Jan 12, 2017 at 4:57 PM, Euler Taveira  wrote:
>>> As Robert suggested in the other email: extension to create old names.
>
>> I don't follow the reasoning for the extension. It seem to have downsides
>> of both alternatives combined: we still break people's code, and we add
>> even more maintenance burden than just keeping aliases.
>
> Yeah, I'm not terribly for the extension idea.  Robert cited the precedent
> of contrib/tsearch2, but I think the history of that is a good argument
> against this: tsearch2 is still there 9 years later and there's no
> indication that we'll ever get rid of it.  We can let things rot
> indefinitely in core too.  If we do ever agree to get rid of the aliases,
> stripping them out of pg_proc.h will not be any harder than removing
> a contrib directory would be.

>From a technical point of view, that is pretty much true.  But from a
user perspective, I don't think it is.  If the old names live in an
extension, then they will not be there by default.  Tools and scripts
will break.  If that's a problem for a particular user, they an
install the extension to unbreak those things, but tool authors will
probably be motivated to upgrade their tools so that the extension is
no longer needed.  Therefore, if we eventually drop the extension,
there will probably be few users relying on it at that point.  On the
other hand, if we install aliases that are present in every install -
and that are not even optionally removable - a lot of people are not
going to bother using the new names at all.  They'll just keep on
using the old ones so that their tools work with both old and new
versions, and if we eventually rip the stuff out of pg_proc.h it will
cause almost as much howling as if we'd done it right at the
beginning.

In other words, adding an extension is the only option that I see
which will give tool authors a strong incentive to support the new
names without making it a hard requirement on day one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-13 Thread Kevin Grittner
On Fri, Jan 13, 2017 at 7:09 AM, Michael Paquier
 wrote:

> There is no real harm in including current_logfiles in base
> backups, but that's really in the same bag as postmaster.opts or
> postmaster.pid, particularly if the log file name has a
> timestamp.

I'm going to dispute that -- if postmaster.opts and postmaster.pid
are present when you restore, it takes away a level of insurance
against restoring a corrupted image of the database without knowing
it.  In particular, if the backup_label file is deleted (which
happens with alarmingly frequency), the startup code may think it
is dealing with a cluster that crashed rather than with a restore
of a backup.  This often leads to corruption (anything from
"database can't start" to subtle index corruption that isn't
noticed for months).  The presence of log files from the time of
the backup do not present a similar hazard.

So while I agree that there is no harm in including
current_logfiles in base backups, I object to the comparisons to
the more dangerous files.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequence data type

2017-01-13 Thread Daniel Verite
Peter Eisentraut wrote:

> So in order to correctly answer the question, is the sequence about to
> run out, you need to look not only at
> the sequence but also any columns it is associated with.  check_postgres
> figures this out, but it's complicated and slow, and not easy to do
> manually.

It strikes me that this is a compelling argument for setting a sensible
MAXVALUE when creating a sequence for a SERIAL, rather than binding
the sequence to a datatype.

In existing releases the SERIAL code sets the maxvalue to 2^63 even
though it knows that the column is limited to 2^31.
It looks like setting it to 2^31 would be enough for the sake of
monitoring.

More generally, what is of interest for the monitoring is how close 
the sequence's last_value is from its max_value, independently of an
underlying type.

2^{15,31,63} are specific cases of particular interest, but there's
no reason to only check for these limits when you can do it
for every possible limit.

For instance if a user has a business need to limit an ID to 1 billion,
they should alter the sequence to a MAXVALUE of 1 billion, and be
interested in how close they are from that, not from 2^31. 


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

2017-01-13 Thread Dilip Kumar
On Wed, Jan 11, 2017 at 7:47 PM, Dilip Kumar  wrote:
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> + PgStat_SqlstmtEntry *htabent;
> + bool found;
> +
> + if (!pgstat_track_sql)
> + return
>
> Callers of pgstat_count_sqlstmt are always ensuring that
> pgstat_track_sql is true, seems it's redundant here.

I have done testing of the feature, it's mostly working as per the expectation.

I have few comments/questions.


If you execute the query with EXECUTE then commandTag will be EXECUTE
that way we will not show the actual query type, I mean all the
statements will get the common tag "EXECUTE".

You can try this.
PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
EXECUTE fooplan(1);

--

+ /* Destroy the SQL statement counter stats HashTable */
+ hash_destroy(pgstat_sql);
+
+ /* Create SQL statement counter Stats HashTable */

I think in the patch we have used mixed of "Stats HashTable" and
"stats HashTable", I think better
to be consistent throughout the patch. Check other similar instances.



@@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)

  PortalDrop(portal, false);

+ /*
+ * Count SQL statement for pg_stat_sql view
+ */
+ if (pgstat_track_sql)
+ pgstat_count_sqlstmt(commandTag);

We are only counting the successful SQL statement, Is that intentional?

--
I have noticed that pgstat_count_sqlstmt is called from
exec_simple_query and exec_execute_message. Don't we want to track the
statement executed from functions?
---

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequence data type

2017-01-13 Thread Daniel Verite
Michael Paquier wrote:

> Hm. Is symmetry an important properly for sequences? It seems to me
> that if we map with the data types we had better use the MIN values
> instead for consistency. So the concept of this patch is rather weird
> and would introduce an exception with the rest of the system just for
> sequences.

Besides there's a related compatibility break in that,
if a sequence is created in an existing release like this:

CREATE SEQUENCE s MINVALUE -9223372036854775808;

And then it's dumped/reloaded on a backend that has
the patch applied, it fails with:

 MINVALUE (-9223372036854775808) is too large for sequence data type bigint

This value (-2^63) is legal in current releases, although it happens
to be off-by-1 compared to the default minvalue for a sequence going
downwards. Arguably it's the default that is weird.

I've started the thread at [1] to discuss whether it makes sense 
in the first place that our CREATE SEQUENCE takes -(2^63)+1 as the
default minvalue rather than -2^63, independantly of this patch.

I think the current patch transforms this oddity into an actual
issue  by making it impossible to reach the real minimum of
a sequence with regard to the type tied to it (-2^15 for smallint,
-2^31 for int, -2^63 for bigint), whereas in HEAD you can still
adjust minvalue to fix the off-by-1 against the bigint range, if you
happen to care about it, the only problem being that you first
need to figure this out by yourself.


[1]
https://www.postgresql.org/message-id/4865a75e-f490-4e9b-b8e7-3d78694c3...@manitou-mail.org


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2017-01-13 Thread Fabien COELHO



Could it be related to transformations operated when the file is downloaded
and saved, because it is a text file?


I think this is delaying the patch unnecessarily, I have attached a
version, please see if you can apply it successfully, we can proceed
with that safely then...


This is the same file I sent:

 sh> sha1sum pgbench-continuation-4.patch pgbench-continuation-3.patch
 97fe805a89707565210699694467f9256ca02dab  pgbench-continuation-4.patch
 97fe805a89707565210699694467f9256ca02dab  pgbench-continuation-3.patch

The difference is that your version is mime-typed with a generic

Application/OCTET-STREAM

While the one I sent was mime-typed as

text/x-diff

as defined by the system in /etc/mime.types on my xenial laptop.

My guess is that with the later your mail client changes the file when 
saving it, because it sees "text".


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-13 Thread Stephen Frost
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 1/12/17 2:22 PM, Stephen Frost wrote:
> > The point I was making above is that the only reason to not make such
> > changes is if they really are entirely arbitrary, but I don't think
> > we'd even be having this discussion if that was the case or that we'd
> > be split about the question.  We already moved forward with the real
> > change here- pg_xlog -> pg_wal, it really astounds me that we're having
> > to argue about what is primairly just clean-up from that change, at
> > least in my view.
> 
> I don't agree with the reasoning.  The change of the directory name was
> because some people erroneously delete the directory.  There is little
> to no risk that people accidentally delete built-in functions.  It was
> not agreed that because we rename the directory that all other uses of
> "xlog" have to be removed as well.  All those subsequent changes need to
> stand on their own.

The subsequent discussion about what to call the directory lead to a
pretty clear winner that a better name is "wal", which fits in pretty
nicely with things like "wal_level", "wal_sync_method", "wal_log_hints",
section 19.5 of our documentation.  Try finding 'xlog' or even
'transaction log' anywhere in our sample postgresql.conf, in fact.

Look at our docs too, we talk about WAL a *great* deal more than we talk
about 'xlog'.  Almost all utilization of 'xlog' that I see is due to
some function or program name which was derived from the directory name,
and half of those end up, confusingly, going back and forth between
"WAL" and "transaction log" (the description of pg_receivexlog is a
great example of this).

In the end, while I don't agree that they really need to, it seems
pretty clear to me that these changes can certainly stand on their own.
The only reason they were ever called 'xlog' was because of the
directory name and without that link, there's really no reason to keep
mis-calling these functions 'xlog' *except* this claim that we don't
want to break things, even though we're changing the directory name.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-01-13 Thread Robert Haas
On Fri, Jan 13, 2017 at 8:58 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Have there been ever discussions about having catcache entries in a
>> shared memory area? This does not sound much performance-wise, I am
>> just wondering about the concept and I cannot find references to such
>> discussions.
>
> I'm sure it's been discussed.  Offhand I remember the following issues:
>
> * A shared cache would create locking and contention overhead.
>
> * A shared cache would have a very hard size limit, at least if it's
> in SysV-style shared memory (perhaps DSM would let us relax that).
>
> * Transactions that are doing DDL have a requirement for the catcache
> to reflect changes that they've made locally but not yet committed,
> so said changes mustn't be visible globally.
>
> You could possibly get around the third point with a local catcache that's
> searched before the shared one, but tuning that to be performant sounds
> like a mess.  Also, I'm not sure how such a structure could cope with
> uncommitted deletions: delete A -> remove A from local catcache, but not
> the shared one -> search for A in local catcache -> not found -> search
> for A in shared catcache -> found -> oops.

I think the first of those concerns is the key one.  If searching the
system catalogs costs $100 and searching the private catcache costs
$1, what's the cost of searching a hypothetical shared catcache?  If
the answer is $80, it's not worth doing.  If the answer is $5, it's
probably still not worth doing.  If the answer is $1.25, then it's
probably worth investing some energy into trying to solve the other
problems you list.  For some users, the memory cost of catcache and
syscache entries multiplied by N backends are a very serious problem,
so it would be nice to have some other options.  But we do so many
syscache lookups that a shared cache won't be viable unless it's
almost as fast as a backend-private cache, or at least that's my
hunch.

I think it would be interested for somebody to build a prototype here
that ignores all the problems but the first and uses some
straightforward, relatively unoptimized locking strategy for the first
problem.  Then benchmark it.  If the results show that the idea has
legs, then we can try to figure out what a real implementation would
look like.

(One possible approach: use Thomas Munro's DHT stuff to build the shared cache.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2017-01-13 Thread Rafia Sabih
On Fri, Jan 13, 2017 at 4:57 PM, Fabien COELHO  wrote:
>
> Hello Rafia,
>
>>>   sh> git apply ~/pgbench-continuation-3.patch
>>>   # ok
>>
>>
>> It still gives me whitespace errors with git apply,
>
>
> Strange.
>
Yes, quite strange.

>> /Users/edb/Downloads/pgbench-continuation-3.patch:31: trailing whitespace.
>> continuation \\{newline}
>
>
>> Looks like an editor issue, I used awk '{ sub("\r$", ""); print }'
>> patch1 > patch2 to clean it and it was applying then.
>
>
> Doing that does not change the file for me.
>
> I see no \r in the patch file according to "od", I just see "nl" (0x0a).
>
> sha1sum: 97fe805a89707565210699694467f9256ca02dab
> pgbench-continuation-3.patch
>
> Could it be related to transformations operated when the file is downloaded
> and saved, because it is a text file?
>
I think this is delaying the patch unnecessarily, I have attached a
version, please see if you can apply it successfully, we can proceed
with that safely then...

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


pgbench-continuation-4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-13 Thread Stephen Frost
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 1/12/17 1:40 PM, Stephen Frost wrote:
> > I just don't buy this argument, at all.  These functions names are
> > certainly not the only things we're changing with PG10 and serious
> > monitoring/backup/administration tools are almost certainly going to
> > have quite a bit to adjust to with the new release, and that isn't news
> > to anyone who works with PG.
> 
> I in turn don't buy this argument. ;-)
> 
> I have checked a variety of WAL-related monitoring scripts,
> graphing/trending scripts, switchover/failover scripts, and the like,
> and of course they all make ample use of a variety of *xlog* functions,
> but as far as I can tell, they don't care about the pg_xlog renaming and
> would continue to work just fine if the functions were not renamed.

The point I was making was that serious montioring systems would have to
be changed and I stand by that.  Sure, there are simple scripts out
there that could, individually, continue to work fine using the old
function names (unless and until we change some behavior of those- which
I'm sure will end up happening at some point), but the argument being
presented is that we shouldn't make people have to change and the only
way to do that would be to revert the pg_xlog -> pg_wal change.

Anything beyond that and we're into arguing about how *much* we should
ask people to change, and for my 2c, that's barely even useful to
discuss because once they have to make any change at all then the
question will come up if it's "worth it" to move to the next version or
the complaint about PG being too difficult to live with will be raised.
I'm not terribly worried about such arguments because, put simply, PG10
is going to have enough awesome features that it's an easy question to
answer.

The argument being presented here is that users won't mind that we
changed the name of a critical directory for every PG system in
existence, but they're going to be upset that we change a few function
names to go along with that change.  Really?

Certainly, check_postgres is going to have to be changed to address this
and, unsurprisingly, it's already had to address a variety of major
version differences that have been introduced over the years.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_upgrade vs. pg_ctl stop -m fast

2017-01-13 Thread Tom Lane
Peter Eisentraut  writes:
> On 1/12/17 9:43 PM, Bruce Momjian wrote:
>> On Thu, Jan 12, 2017 at 10:17:52AM -0500, Tom Lane wrote:
>>> Peter Eisentraut  writes:
 I think the last line should be changed to something like
 fast ? "-m fast" : "-m smart");

>>> There is maybe room for a separate discussion about whether pg_upgrade
>>> *should* be using fast mode, but if so we could remove the "bool fast"
>>> argument from this function altogether.

>> Agreed, it should be remove.  Should I do it?

> For 9.5 and 9.6, I think we should backpatch what I suggested above, to
> minimize the behavior change.  For master we can consider removing the
> distinction and just use fast shutdown all the time, but I haven't
> checked all the possible implications of that change.

That sounds sensible to me.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-13 Thread Peter Eisentraut
On 1/12/17 2:22 PM, Stephen Frost wrote:
> The point I was making above is that the only reason to not make such
> changes is if they really are entirely arbitrary, but I don't think
> we'd even be having this discussion if that was the case or that we'd
> be split about the question.  We already moved forward with the real
> change here- pg_xlog -> pg_wal, it really astounds me that we're having
> to argue about what is primairly just clean-up from that change, at
> least in my view.

I don't agree with the reasoning.  The change of the directory name was
because some people erroneously delete the directory.  There is little
to no risk that people accidentally delete built-in functions.  It was
not agreed that because we rename the directory that all other uses of
"xlog" have to be removed as well.  All those subsequent changes need to
stand on their own.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Tom Lane
Craig Ringer  writes:
> Sounds like it'd be better as a separate change so as not to block this one.

Yeah, if we do that at all it should be a separate patch IMO.  The
concerns are largely different.

> I really like what you have done Tom, though I'm about to travel so I
> haven't read it in full detail. Like Fabien I would've been certain that
> it'd be rejected if I tried it, but I sure am glad you did it.

Does anyone want to do further review on this before I push it?

One thing that I'm not quite satisfied with is the business with
non-top-level RawStmt nodes in some utility statements.  That's certainly
a wart from gram.y's perspective, and it's mostly a wart from analyze.c's
perspective as well --- the parse analyze routines mostly just throw away
the non-top-level RawStmt.

The original reason for doing it was that DoCopy needs to call
pg_analyze_and_rewrite() on the sub-statement, and I wanted
pg_analyze_and_rewrite's argument to be declared as RawStmt, so CopyStmt's
sub-statement had to have a RawStmt.  And it seemed like a good idea for
the other utility statements to be consistent with that.

However, when all the dust settled, DoCopy ended up needing to construct
a RawStmt node anyway for the case where it's inventing a Query tree to
support an RLS query.  We could make it construct a RawStmt node in the
plain copy-from-query case too, and then there needn't be one in the
grammar output.

So I'm now thinking that it might be better if the grammar produced
RawStmt only at top level, and anybody who calls pg_analyze_and_rewrite
on sub-sections of a utility statement has to cons up a RawStmt to put
at the top of the sub-query.  This complicates life a little for utility
statements that call parse analysis during execution, but the principle
that RawStmt appears only at top level seems attractively simple.
We don't nest PlannedStmts either.

One thing that connects into this is why do we have some utility
statements that do parse analysis of sub-statements immediately during
their own parse analysis, while others postpone that to execution?
I've not researched it yet, but my vague memory is that EXPLAIN and
friends used to all do it in the second way, and we realized that that
was broken so we changed them to the first way.  COPY has only recently
grown the ability to have a sub-query, and I'm wondering if it's just a
failure of institutional memory that led us to do it the second way.
If we ended up requiring all these cases to work more like EXPLAIN,
they'd not be needing to construct RawStmts at execution anyway.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-13 Thread Peter Eisentraut
On 1/12/17 1:50 PM, Stephen Frost wrote:
> When they're changes that are primairly going to affect
> monitoring/backup/administration tools, yes, I do think we can make just
> about arbitrary backward-incompatible changes.

I don't agree with that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-13 Thread Peter Eisentraut
On 1/12/17 1:40 PM, Stephen Frost wrote:
> I just don't buy this argument, at all.  These functions names are
> certainly not the only things we're changing with PG10 and serious
> monitoring/backup/administration tools are almost certainly going to
> have quite a bit to adjust to with the new release, and that isn't news
> to anyone who works with PG.

I in turn don't buy this argument. ;-)

I have checked a variety of WAL-related monitoring scripts,
graphing/trending scripts, switchover/failover scripts, and the like,
and of course they all make ample use of a variety of *xlog* functions,
but as far as I can tell, they don't care about the pg_xlog renaming and
would continue to work just fine if the functions were not renamed.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade vs. pg_ctl stop -m fast

2017-01-13 Thread Peter Eisentraut
On 1/12/17 9:43 PM, Bruce Momjian wrote:
> On Thu, Jan 12, 2017 at 10:17:52AM -0500, Tom Lane wrote:
>> Peter Eisentraut  writes:
>>> In 9.5, the default pg_ctl stop mode was changed from "smart" to "fast".
>>> In pg_upgrade, there is this code:
>>> ...
>>> I think the last line should be changed to something like
>>>   fast ? "-m fast" : "-m smart");
>>
>> Ugh.  Clear oversight.
>>
>> There is maybe room for a separate discussion about whether pg_upgrade
>> *should* be using fast mode, but if so we could remove the "bool fast"
>> argument from this function altogether.
> 
> Agreed, it should be remove.  Should I do it?

For 9.5 and 9.6, I think we should backpatch what I suggested above, to
minimize the behavior change.  For master we can consider removing the
distinction and just use fast shutdown all the time, but I haven't
checked all the possible implications of that change.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Index Scans

2017-01-13 Thread Anastasia Lubennikova

27.12.2016 17:33, Amit Kapila:

On Fri, Dec 23, 2016 at 6:42 PM, Anastasia Lubennikova
 wrote:

22.12.2016 07:19, Amit Kapila:

On Wed, Dec 21, 2016 at 8:46 PM, Anastasia Lubennikova
 wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Hi, thank you for the patch.
Results are very promising. Do you see any drawbacks of this feature or
something that requires more testing?


I think you can focus on the handling of array scan keys for testing.
In general, one of my colleagues has shown interest in testing this
patch and I think he has tested as well but never posted his findings.
I will request him to share his findings and what kind of tests he has
done, if any.



Check please code related to buffer locking and pinning once again.
I got the warning. Here are the steps to reproduce it:
Except "autovacuum = off" config is default.

pgbench -i -s 100 test
pgbench -c 10 -T 120 test

 SELECT count(aid) FROM pgbench_accounts
 WHERE aid > 1000 AND aid < 90 AND bid > 800 AND bid < 900;
WARNING:  buffer refcount leak: [8297] (rel=base/12289/16459, blockNum=2469,
flags=0x9380, refcount=1 1)
  count


The similar problem has occurred while testing "parallel index only
scan" patch and Rafia has included the fix in her patch [1] which
ideally should be included in this patch, so I have copied the fix
from her patch.  Apart from that, I observed that similar problem can
happen for backward scans, so fixed the same as well.


I confirm that this problem is solved.


But I'm trying to find the worst cases for this feature. And I suppose we
should test parallel index scans with
concurrent insertions. More parallel readers we have, higher the
concurrency.
I doubt that it can significantly decrease performance, because number of
parallel readers is not that big,


I am not sure if such a test is meaningful for this patch because
parallelism is generally used for large data reads and in such cases
there are generally not many concurrent writes.


I didn't find any case of noticeable performancedegradation,
so set status to "Ready for committer".
Thank you for this patch.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-01-13 Thread Peter Moser
> What this patch does is to add two new clauses for FROM-list items,
> NORMALIZE and ALIGN, which reshuffle a set of ranges into a new list
> that can then be aggregated more easily.  From the original message:
>
> > For NORMALIZE the tuples' ranges need to be split into all sub-ranges
> > according to all matching ranges of the second relation. For this we
> > create a subquery that first joins one relation with the range
> > boundaries of the other and then sorts the result. The executor
> > function splits the ranges in a sweep-line based manner.
> >
> > For ALIGN the tuples' ranges must be split into all intersections and
> > differences with the other relation according to the join condition.
> > For this we create a subquery that first joins the two relations and
> > then sorts the result. The executor function splits the ranges
> > accordingly in a sweep-line based manner.
>
> So there isn't really temporal query processing as such here, only some
> helpers that can make it easier.

The goal of temporal aligners and normalizers is to split ranges to allow a
reduction from temporal queries to their non-temporal counterparts.
Splitting
ranges is necessary for temporal query processing. Temporal aligners and
normalizer may then be used as building-blocks for any temporal query
construct.

> I can see how those operations can be useful, but it would help if there
> were a more formal definition to be able to check that further.

We have published two papers, that contain formal definitions and related
work
for the temporal aligner and normalizer. Please see [1] and [2].

> What I'm missing here is some references: existing implementations,
> standards, documentation, research papers, alternative ideas, rejected
> alternatives, etc.

A good overview of existing implementations in DBMSs, SQL standard, and
history
is given in [3].

> Also, the submission is missing documentation and test cases.  There are
> technical terms used in the code that I don't understand.

We added a second patch with test cases and expected results. We are now
writing the documentation in sgml-format.

> I think there are probably many interesting applications for normalizing
> or otherwise adjusting ranges.  I'd like to see an overview and
> consideration of other applications.

Please see the attached file adjustment.sql for some interesting
applications.

> Ideally, I'd like to see these things implemented as some kind of
> user-space construct, like an operator or function.  I think we'd need a
> clearer definition of what it is they do before we can evaluate that.

Can you please explain what you mean by "user-space construct" in this case.



Best regards,
Anton, Johann, Michael, Peter


[1] Anton Dignös, Michael H. Böhlen, Johann Gamper:
Temporal alignment. SIGMOD Conference 2012: 433-444
http://doi.acm.org/10.1145/2213836.2213886
[2] Anton Dignös, Michael H. Böhlen, Johann Gamper, Christian S. Jensen:
Extending the Kernel of a Relational DBMS with Comprehensive Support for
Sequenced Temporal Queries. ACM Trans. Database Syst. 41(4): 26:1-26:46
(2016)
http://doi.acm.org/10.1145/2967608
[3] https://www2.cs.arizona.edu/people/rts/sql3.html and
https://www2.cs.arizona.edu/people/rts/tsql2.html


adjustment.sql
Description: application/sql
diff --git src/test/regress/expected/temporal_primitives.out src/test/regress/expected/temporal_primitives.out
new file mode 100644
index 000..6e4cc0d
--- /dev/null
+++ src/test/regress/expected/temporal_primitives.out
@@ -0,0 +1,841 @@
+--
+-- TEMPORAL PRIMITIVES
+--
+SET datestyle TO ymd;
+CREATE COLLATION "de_DE.utf8" (LC_COLLATE = "de_DE.utf8",
+   LC_CTYPE = "de_DE.utf8" );
+CREATE TEMP TABLE tpg_table1 (a char, b char, ts int, te int);
+CREATE TEMP TABLE tpg_table2 (c int, d char, ts int, te int);
+INSERT INTO tpg_table1 VALUES
+('a','B',1,7),
+('b','B',3,9),
+('c','G',8,10);
+INSERT INTO tpg_table2 VALUES
+(1,'B',2,5),
+(2,'B',3,4),
+(3,'B',7,9);
+-- VALID TIME columns (i.e., ts and te) are no longer at the end of the
+-- targetlist.
+CREATE TEMP TABLE tpg_table3 AS
+	SELECT a, ts, te, b FROM tpg_table1;
+CREATE TEMP TABLE tpg_table4 AS
+	SELECT c, ts, d, te FROM tpg_table2;
+-- VALID TIME columns represented as range type
+CREATE TEMP TABLE tpg_table5 AS
+	SELECT int4range(ts, te) t, a, b FROM tpg_table1;
+CREATE TEMP TABLE tpg_table6 AS
+	SELECT int4range(ts, te) t, c a, d b FROM tpg_table2;
+-- VALID TIME columns as VARCHARs
+CREATE TEMP TABLE tpg_table7 (a int, ts varchar, te varchar);
+CREATE TEMP TABLE tpg_table8 (a int,
+			  ts varchar COLLATE "de_DE.utf8",
+			  te varchar COLLATE "POSIX");
+INSERT INTO tpg_table7 VALUES
+(0, 'A', 'D'),
+(1, 'C', 'X'),
+(0, 'ABC', 'BCD'),
+(0, 'xABC', 'xBCD'),
+(0, 'BAA', 'BBB');
+INSERT INTO tpg_table8 VALUES
+(0, 'A', 'D'),
+(1, 'C', 'X');
+-- Tables to check different data types, and corner cases
+CREATE TEMP TABLE tpg_table9 (a int, ts timestamp, te timestamp);
+CREATE TEMP TABLE tpg_table10 

Re: WG: [HACKERS] Packages: Again

2017-01-13 Thread Craig Ringer
On 13 Jan. 2017 19:16, "Thomas Kellerer" <


Which is a bit cumbersome given Oracle's limit on 30 characters for
identifiers - but it still increases maintainability. And one of the
advantages given for packages was the increase in namespace availability
which is much easier with Postgres anyway.


I was wondering where the namespace thing came from. Sure,
packagename_funcname I'd cumbersome but it's not exactly hard and we've
been doing it in C since forever.

I'd assumed it was an issue in the opposite direction. PG identifiers being
too short. But it sounds like instead it's people not realising they can do
this.


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Craig Ringer
On 13 Jan. 2017 16:35, "Kyotaro HORIGUCHI" 
wrote:

Hello,

At Thu, 12 Jan 2017 20:08:54 +0100 (CET), Fabien COELHO 
wrote in 
>
> About having a pointer to the initial string from RawStmt, Query &
> PlannedStmt:
>
> > I remembered one reason why we haven't done this: it's unclear how
> > we'd handle copying if we do it. If, say, Query contains a "char *"
> > pointer then you'd expect copyObject() to pstrdup that string, [...,
> > So] We'd need to work out a way of managing multiple Queries carrying
> > references to the same source string, and it's not clear how to do
> > that reasonably.
>
> For me it would be shared, but then it may break some memory
> management hypothesis downstream.

+1 to they have a pointer to the shared query string. But doing
that without some measure like reference counting seems
difficult..



Sounds like it'd be better as a separate change so as not to block this one.

I really like what you have done Tom, though I'm about to travel so I
haven't read it in full detail. Like Fabien I would've been certain that
it'd be rejected if I tried it, but I sure am glad you did it.


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> I found an inconsistency (in style, not function:) between
> copyfuncs and equalfuncs. The patch handles the three members
> utilityStmt, stmt_location and stmt_len of Query at once and
> copyfuncs seems to be edited to follow the rule, but in
> equalfuncs the position of utilityStmt is not moved.

I think you're looking at struct Query's instance of utilityStmt.
I didn't move that one.  (Maybe I should've, but it didn't
look quite as randomly dropped into the middle of the struct
as the PlannedStmt one did.  Just a judgment call in either case.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-01-13 Thread Tom Lane
Michael Paquier  writes:
> Have there been ever discussions about having catcache entries in a
> shared memory area? This does not sound much performance-wise, I am
> just wondering about the concept and I cannot find references to such
> discussions.

I'm sure it's been discussed.  Offhand I remember the following issues:

* A shared cache would create locking and contention overhead.

* A shared cache would have a very hard size limit, at least if it's
in SysV-style shared memory (perhaps DSM would let us relax that).

* Transactions that are doing DDL have a requirement for the catcache
to reflect changes that they've made locally but not yet committed,
so said changes mustn't be visible globally.

You could possibly get around the third point with a local catcache that's
searched before the shared one, but tuning that to be performant sounds
like a mess.  Also, I'm not sure how such a structure could cope with
uncommitted deletions: delete A -> remove A from local catcache, but not
the shared one -> search for A in local catcache -> not found -> search
for A in shared catcache -> found -> oops.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Too many autovacuum workers spawned during forced auto-vacuum

2017-01-13 Thread Alvaro Herrera
Amit Khandekar wrote:
> In a server where autovacuum is disabled and its databases reach
> autovacuum_freeze_max_age limit, an autovacuum is forced to prevent
> xid wraparound issues. At this stage, when the server is loaded with a
> lot of DML operations, an exceedingly high number of autovacuum
> workers keep on getting spawned, and these do not do anything, and
> then quit.

I think this is the same problem as reported in
https://www.postgresql.org/message-id/CAMkU=1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta=ypy...@mail.gmail.com

> === Fix ===
[...]
> Instead, the attached patch (prevent_useless_vacuums.patch) prevents
> the repeated cycle by noting that there's no point in doing whatever
> vac_update_datfrozenxid() does, if we didn't find anything to vacuum
> and there's already another worker vacuuming the same database. Note
> that it uses wi_tableoid field to check concurrency. It does not use
> wi_dboid field to check for already-processing worker, because using
> this field might cause each of the workers to think that there is some
> other worker vacuuming, and eventually no one vacuums. We have to be
> certain that the other worker has already taken a table to vacuum.

Hmm, it seems reasonable to skip the end action if we didn't do any
cleanup after all.  This would normally give enough time between vacuum
attempts for the first worker to make further progress and avoid causing
a storm.  I'm not really sure that it fixes the problem completely, but
perhaps it's enough.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallelize queries containing subplans

2017-01-13 Thread Tom Lane
Dilip Kumar  writes:
> ERROR:  did not find '}' at end of input node at character 762

I've not looked at the patches, but just seeing this error message,
this looks like somebody's fat-fingered the correspondence between
outfuncs.c and readfuncs.c processing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallelize queries containing subplans

2017-01-13 Thread Dilip Kumar
On Thu, Jan 12, 2017 at 7:56 PM, Amit Kapila  wrote:
> Okay, done that way.  I have fixed the review comments raised by Dilip
> as well and added the test case in attached patch.

I have tested with pq_pushdown_subplan_v2.patch +
pq_pushdown_correl_subplan_v1.patch

sqlsmith. is reporting below error, I have attached the query to
reproduce the issue. This issue is only coming when both the patch is
applied, only with pq_pushdown_subplan_v2.patch there is no problem.
So I think the problem is because of
pq_pushdown_correl_subplan_v1.patch.

ERROR:  did not find '}' at end of input node at character 762

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
select  
  subq_2.c1 as c0, 
  25 as c1
from 
  (select  
pg_catalog.random() as c0, 
31 as c1, 
ref_3.prsend as c2, 
subq_1.c2 as c3
  from 
(select  
(select page from pg_catalog.pg_locks limit 1 offset 10)
 as c0, 
9 as c1, 
sample_1.opfname as c2, 
sample_1.opfnamespace as c3
  from 
pg_catalog.pg_opfamily as sample_1 tablesample 
bernoulli (5.4) ,
lateral (select  
  57 as c0, 
  7 as c1
from 
  pg_catalog.pg_amop as sample_2 tablesample system 
(8.7) 
where 54 > (select character_maximum_length from 
information_schema.element_types limit 1 offset 28)

limit 8) as subq_0
  where (((pg_catalog.pg_postmaster_start_time() < 
pg_catalog.pg_stat_get_snapshot_timestamp()) 
and (subq_0.c0 is not NULL)) 
  and (pg_catalog.pg_backend_pid() <> 30)) 
or (subq_0.c0 >= (select active_pid from 
pg_catalog.pg_replication_slots limit 1 offset 9)
)
  limit 167) as subq_1
  inner join pg_catalog.pg_ts_parser as ref_3
  on (subq_1.c2 = ref_3.prsname )
  where subq_1.c1 <= subq_1.c0
  limit 56) as subq_2,
  lateral (select  
subq_2.c0 as c0
  from 
pg_catalog.pg_statistic as sample_3 tablesample bernoulli (2.3) 
  where (EXISTS (
  select  
  pg_catalog.pg_trigger_depth() as c0, 
  sample_4.comments as c1
from 
  information_schema.sql_sizing_profiles as sample_4 
tablesample bernoulli (9.9) 
where (pg_catalog.statement_timestamp() is not NULL) 
  and (((select character_maximum_length from 
information_schema.domains limit 1 offset 40)
 < 7) 
or ((select ordinal_position from 
information_schema.parameters limit 1 offset 21)
 = sample_4.sizing_id) 
or (sample_4.sizing_id <> 
sample_4.required_value)) 
  or (sample_4.sizing_id <= 
pg_catalog.pg_stat_get_buf_alloc())) 
and 
((pg_catalog.pg_stat_get_bgwriter_buf_written_checkpoints() <> 
pg_catalog.gin_cmp_tslexeme(
  cast(pg_catalog.timeofday() as text),
  cast(null as text))) 
  and (sample_4.profile_id is NULL))) 
  or ((select ordinal_position from 
information_schema.attributes limit 1 offset 31)
 < sample_4.sizing_id))) 
  and (sample_4.sizing_id <= sample_4.sizing_id)) 
or (sample_4.sizing_id <> 40)) 
  and ((sample_4.required_value <> sample_4.sizing_id) 
and (false))) 
or (((pg_catalog.int8range_canonical(
cast(null as int8range)) is not NULL) 
and (sample_4.required_value >= 58)) 
  and (((select objsubid from pg_catalog.pg_seclabel 
limit 1 offset 1)
 is NULL) 
and (((select typtypmod from pg_catalog.pg_type 
limit 1 offset 32)
 <= (select character_maximum_length from 
information_schema.element_types limit 1 offset 33)
) 
  or (13 <= 9)
limit 67)) 
and ((27 <= (select sourceline from pg_catalog.pg_settings 
limit 1 offset 6)
) 
  or (sample_3.staattnum <= subq_2.c1))
  limit 52) as subq_3
where true;

-- 
Sent via pgsql-hackers mailing list 

[HACKERS] IF NOT EXISTS option for CREATE SERVER and CREATE USER MAPPING statements

2017-01-13 Thread Anastasia Lubennikova
I implemented IF NOT EXISTS option for CREATE SERVER and CREATE USER 
MAPPING statements

for one of our customers.
I think other users can also find it useful for scripting and automated 
tasks.
The patches themselves are small and simple. Documentation is updated as 
well.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit cf20fa8c9f29310d8f67e5b198a9eb908d903431
Author: Anastasia 
Date:   Fri Jan 13 16:22:01 2017 +0300

Add [IF NOT EXISTS] option to CREATE SERVER statement

diff --git a/doc/src/sgml/ref/create_server.sgml b/doc/src/sgml/ref/create_server.sgml
index 734c6c9..6777679 100644
--- a/doc/src/sgml/ref/create_server.sgml
+++ b/doc/src/sgml/ref/create_server.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-CREATE SERVER server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
+CREATE SERVER [IF NOT EXISTS] server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
 FOREIGN DATA WRAPPER fdw_name
 [ OPTIONS ( option 'value' [, ... ] ) ]
 
@@ -56,6 +56,19 @@ CREATE SERVER server_name [ TYPE '<
   Parameters
 
   
+
+  
+IF NOT EXISTS
+
+ 
+  Do not throw an error if a server with the same name already exists.
+  A notice is issued in this case.  Note that there is no guarantee that
+  the existing server is anything like the one that would have been
+  created.
+ 
+
+   
+

 server_name
 
diff --git a/src/backend/commands/foreigncmds.c b/src/backend/commands/foreigncmds.c
index 06b4bc3..0b0114c 100644
--- a/src/backend/commands/foreigncmds.c
+++ b/src/backend/commands/foreigncmds.c
@@ -883,12 +883,25 @@ CreateForeignServer(CreateForeignServerStmt *stmt)
 
 	/*
 	 * Check that there is no other foreign server by this name.
+	 * Do nothing if IF NOT EXISTS was enforced.
 	 */
 	if (GetForeignServerByName(stmt->servername, true) != NULL)
-		ereport(ERROR,
-(errcode(ERRCODE_DUPLICATE_OBJECT),
- errmsg("server \"%s\" already exists",
-		stmt->servername)));
+	{
+		if (stmt->if_not_exists)
+		{
+			ereport(NOTICE,
+	(errcode(ERRCODE_DUPLICATE_OBJECT),
+	 errmsg("foreign server \"%s\" already exists, skipping",
+			stmt->servername)));
+			heap_close(rel, RowExclusiveLock);
+			return InvalidObjectAddress;
+		}
+		else
+			ereport(ERROR,
+	(errcode(ERRCODE_DUPLICATE_OBJECT),
+	 errmsg("foreign server \"%s\" already exists",
+			stmt->servername)));
+	}
 
 	/*
 	 * Check that the FDW exists and that we have USAGE on it. Also get the
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9eef550..ab4b3b1 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -4621,6 +4621,19 @@ CreateForeignServerStmt: CREATE SERVER name opt_type opt_foreign_server_version
 	n->version = $5;
 	n->fdwname = $9;
 	n->options = $10;
+	n->if_not_exists = false;
+	$$ = (Node *) n;
+}
+| CREATE SERVER IF_P NOT EXISTS name opt_type opt_foreign_server_version
+		 FOREIGN DATA_P WRAPPER name create_generic_options
+{
+	CreateForeignServerStmt *n = makeNode(CreateForeignServerStmt);
+	n->servername = $6;
+	n->servertype = $7;
+	n->version = $8;
+	n->fdwname = $12;
+	n->options = $13;
+	n->if_not_exists = true;
 	$$ = (Node *) n;
 }
 		;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 7ceaa22..8a79ec0 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2075,6 +2075,7 @@ typedef struct CreateForeignServerStmt
 	char	   *servertype;		/* optional server type */
 	char	   *version;		/* optional server version */
 	char	   *fdwname;		/* FDW name */
+	bool		if_not_exists;	/* just do nothing if it already exists? */
 	List	   *options;		/* generic options to server */
 } CreateForeignServerStmt;
 
commit 4237a23d6674da774b0772ffc953e5f499c04803
Author: Anastasia 
Date:   Fri Jan 13 16:23:53 2017 +0300

Add [IF NOT EXISTS] option to CREATE USER MAPPING statement

diff --git a/doc/src/sgml/ref/create_user_mapping.sgml b/doc/src/sgml/ref/create_user_mapping.sgml
index 44fe302..680906b 100644
--- a/doc/src/sgml/ref/create_user_mapping.sgml
+++ b/doc/src/sgml/ref/create_user_mapping.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-CREATE USER MAPPING FOR { user_name | USER | CURRENT_USER | PUBLIC }
+CREATE USER MAPPING [IF NOT EXISTS] FOR { user_name | USER | CURRENT_USER | PUBLIC }
 SERVER server_name
 [ OPTIONS ( option 'value' [ , ... ] ) ]
 
@@ -49,6 +49,18 @@ CREATE USER MAPPING FOR { user_name
  
   Parameters
 
+  
+IF NOT EXISTS
+
+ 
+  Do not throw an error if a user mapping with the same name already exists.
+  A notice is issued in this case.  Note that there is no guarantee that
+  the existing user mapping is anything like the one that would have been
+  created.
+ 
+
+   

Re: [HACKERS] [COMMITTERS] pgsql: Fix field order in struct catcache.

2017-01-13 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Somebody failed to grasp the point of having the #ifdef CATCACHE_STATS
>> fields at the end of the struct.  Put that back the way it should be,
>> and add a comment making it more explicit why it should be that way.

> Actually, the CATCACHE_STATS was added purposefully not at end of
> struct, because back then cc_bucket was a variable-length array; see
> commit 786340441706 (2002).  It only became possible to do this after
> commit 20cb18db4668 (2013).  I researched this because I vaguely
> believed this to be my fault (a66ee69ad) but that's not so.

Hah, okay.  Well, it's better now than it was.

(Actually, I wonder why we expose struct catcache to the rest of the
world at all.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix field order in struct catcache.

2017-01-13 Thread Alvaro Herrera
Tom Lane wrote:
> Fix field order in struct catcache.
> 
> Somebody failed to grasp the point of having the #ifdef CATCACHE_STATS
> fields at the end of the struct.  Put that back the way it should be,
> and add a comment making it more explicit why it should be that way.

Actually, the CATCACHE_STATS was added purposefully not at end of
struct, because back then cc_bucket was a variable-length array; see
commit 786340441706 (2002).  It only became possible to do this after
commit 20cb18db4668 (2013).  I researched this because I vaguely
believed this to be my fault (a66ee69ad) but that's not so.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixing matching of boolean index columns to sort ordering

2017-01-13 Thread Tom Lane
Michael Paquier  writes:
> And actually, contrary to what is mentioned upthread, the planner is
> not able to avoid a sort phase if other data types are used, say:
> =# create table foo (a int, b int);
> CREATE TABLE
> =# create index on foo (a, b);
> CREATE INDEX
> =# explain (costs off) select * from foo where a = 1 order by b limit 10;

No, there's a difference between "not able to" and "chooses not to".
In this example case, it just thinks a bitmap scan is cheaper than
an ordered scan:

regression=# explain select * from foo where a = 1 order by b limit 10;
  QUERY PLAN
   
---
 Limit  (cost=15.10..15.13 rows=10 width=8)
   ->  Sort  (cost=15.10..15.13 rows=11 width=8)
 Sort Key: b
 ->  Bitmap Heap Scan on foo  (cost=4.24..14.91 rows=11 width=8)
   Recheck Cond: (a = 1)
   ->  Bitmap Index Scan on foo_a_b_idx  (cost=0.00..4.24 rows=11 
width=0)
 Index Cond: (a = 1)
(7 rows)

regression=# set enable_bitmapscan to 0;
SET
regression=# explain select * from foo where a = 1 order by b limit 10;
 QUERY PLAN 


 Limit  (cost=0.15..33.06 rows=10 width=8)
   ->  Index Only Scan using foo_a_b_idx on foo  (cost=0.15..36.35 rows=11 
width=8)
 Index Cond: (a = 1)
(3 rows)

The problem with the boolean-column case is it fails to recognize
that the index matches at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] how to correctly invalidate a constraint?

2017-01-13 Thread Pavel Stehule
Hi

I would to do import without RI check - so I disable RI triggers.

I would to invalidate constraint on Foreign Keys and then I would to use
ALTER TABLE VALIDATE CONSTRAINT ...

I didn't find how to invalidate constraint without direct update
pg_constraint.

Is there some clean way?

Regards

Pavel


Re: [HACKERS] Unused member root in foreign_glob_cxt

2017-01-13 Thread Tom Lane
Ashutosh Bapat  writes:
> On Thu, Jan 12, 2017 at 6:39 PM, Tom Lane  wrote:
>> I think you'd just end up putting it back at some point.  It's the only
>> means that foreign_expr_walker() has for getting at the root pointer,
>> and nearly all planner code needs that.  Seems to me it's just chance
>> that foreign_expr_walker() doesn't need it right now.

> I agree with you that most of the planner code needs that but not
> every planner function accepts root as an argument.

[ shrug... ]  The general trend in the planner is that things get more
complicated because we start accounting for new effects.  I don't have
any faith in the claim that because is_foreign_expr hasn't needed
PlannerInfo yet, it never will, because what it's required to do is
inherently spongy and complicated.  It wouldn't particularly surprise
me if someone wanted to start putting cost considerations in there,
for example.

In short, I think that removing this argument is just make-work that
we're very likely to have to undo at some point; and the further you
extend the changes, the larger a hazard for back-patching it will be.

If you can convince some other committer that this is a good idea, fine,
but I won't commit it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql - additional extra checks

2017-01-13 Thread Marko Tiikkaja
On Fri, Jan 13, 2017 at 2:46 AM, Jim Nasby  wrote:

> On 1/11/17 5:54 AM, Pavel Stehule wrote:
>
>> +too_many_rows
>> +
>> + 
>> +  When result is assigned to a variable by INTO
>> clause,
>> +  checks if query returns more than one row. In this case the
>> assignment
>> +  is not deterministic usually - and it can be signal some issues in
>> design.
>>
>
> Shouldn't this also apply to
>
> var := blah FROM some_table WHERE ...;
>
> ?
>
> AIUI that's one of the beefs the plpgsql2 project has.
>

No, not at all.  That syntax is undocumented and only works because
PL/PgSQL is a hack internally.  We don't use it, and frankly I don't think
anyone should.


.m


Re: [HACKERS] Parallel bitmap heap scan

2017-01-13 Thread Dilip Kumar
On Thu, Jan 12, 2017 at 5:47 PM, Dilip Kumar  wrote:
>> Hello Dilip,
>> I was trying to test the performance of all the parallel query related
>> patches on TPC-H queries, I found that when parallel hash [1 ]and your
>> patch are applied then Q10 and Q14 were hanged, however, without any
>> one of these patches, these queries are running fine. Following is the
>> stack trace,
>
> Thanks for reporting, I will look into this.
I had setup for TPCH scale factor 5, I could reproduce the hang
reported by you with Q10. I have fixed the issue in v10 (only 0003 is
changed other patches have no change). However, after fixing this
issue it's crashing in parallel shared hash patch and call stack is
same what you have reported on parallel shared hash thread[1]

[1] 
https://www.postgresql.org/message-id/CAOGQiiOJji9GbLwfT0vBuixdkgAsxzZjTpPxsO6BiTLD--SzYg%40mail.gmail.com

Please verify with the new patch.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


0003-parallel-bitmap-heap-scan-v10.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-13 Thread Michael Paquier
On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold  wrote:
> Le 13/01/2017 à 05:26, Michael Paquier a écrit :
>> Surely the temporary file of current_logfiles should not be included
>> in base backups (look at basebackup.c). Documentation needs to reflect
>> that as well. Also, should we have current_logfiles in a base backup?
>> I would think no.
> Done but can't find any documentation about the file exclusion, do you
> have a pointer?

protocol.sgml, in the protocol-replication part. You could search for
the paragraph that contains postmaster.opts. There is no real harm in
including current_logfiles in base backups, but that's really in the
same bag as postmaster.opts or postmaster.pid, particularly if the log
file name has a timestamp.

>> pg_current_logfile() can be run by *any* user. We had better revoke
>> its access in system_views.sql!
> Why? There is no special information returned by this function unless
> the path but it can be retrieve using show log_directory.

log_directory could be an absolute path, and we surely don't want to
let normal users have a look at it. For example, "show log_directory"
can only be seen by superusers. As Stephen Frost is for a couple of
months (years?) on a holly war path against super-user checks in
system functions, I think that revoking the access to this function is
the best thing to do. It is as well easier to restrict first and
relax, the reverse is harder to justify from a compatibility point of
view.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw bug in 9.6

2017-01-13 Thread Etsuro Fujita

On 2017/01/13 0:43, Robert Haas wrote:

On Wed, Jan 11, 2017 at 2:45 AM, Etsuro Fujita
 wrote:

As I said before, that might be fine for 9.6, but I don't think it's a good
idea to search the pathlist because once we support parameterized foreign
join paths, which is on my TODOs, we would have to traverse through the
possibly-lengthy pathlist to find a local-join path, as mentioned in [3].



I'm not sure that's really going to be a problem.  The number of
possible parameterizations that need to be considered isn't usually
very big.  I bet the path list will have ten or a few tens of entries
in it, not a hundred or a thousand.  Traversing it isn't that
expensive.

That having been said, I haven't read the patches, so I'm not really
up to speed on the bigger issues here.  But surely it's more important
to get the overall design right than to worry about the cost of
walking the pathlist or worrying about the cost of an extra function
call (?!).


My biggest concern about GetExistingLocalJoinPath is that might not be 
extendable to the case of foreign-join paths with parameterization; in 
which case, fdw_outerpath for a given foreign-join path would need to 
have the same parameterization as the foreign-join path, but there might 
not be any existing paths with the same parameterization in the path 
list.  You might think we could get the fdw_outerpath by getting an 
existing path with no parameterization as in GetExistingLocalJoinPath 
and then modifying the path's param_info to match the parameterization 
of the foreign-join path.  I don't know that really works, but that 
might be inefficient.


What I have in mind to support foreign-join paths with parameterization 
for postgres_fdw like this: (1) generate parameterized paths from any 
joinable combination of the outer/inner cheapest-parameterized paths 
that have pushed down the outer/inner relation to the remote server, in 
a similar way as postgresGetForeignJoinPaths creates unparameterized 
paths, and (2) create fdw_outerpath for each parameterized path from the 
outer/inner paths used to generate the parameterized path, by 
create_nestloop_path (or, create_hashjoin_path or create_mergejoin_path 
if full join), so that the resulting fdw_outerpath has the same 
parameterization as the paramterized path.  This would probably work and 
might be more efficient.  And the patch I proposed would be easily 
extended to this, by replacing the outer/inner cheapest-total paths with 
the outer/inner cheapest-parameterized paths.  Attached is the latest 
version of the patch.


Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***
*** 1519,1540  SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t
 Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
 Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
 Remote SQL: SELECT r1."C 1", r1.c3, CASE WHEN (r1.*)::text IS NOT NULL THEN ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8) END, r2."C 1", CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) END FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1" ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR UPDATE OF r1
!->  Merge Join
   Output: t1.c1, t1.c3, t1.*, t2.c1, t2.*
!  Merge Cond: (t1.c1 = t2.c1)
!  ->  Sort
 Output: t1.c1, t1.c3, t1.*
!Sort Key: t1.c1
!->  Foreign Scan on public.ft1 t1
!  Output: t1.c1, t1.c3, t1.*
!  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" FOR UPDATE
!  ->  Sort
 Output: t2.c1, t2.*
!Sort Key: t2.c1
!->  Foreign Scan on public.ft2 t2
!  Output: t2.c1, t2.*
!  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"
! (23 rows)
  
  SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10 FOR UPDATE OF t1;
   c1  | c1  
--- 1519,1534 
 Output: t1.c1, t2.c1, t1.c3, t1.*, t2.*
 Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
 Remote SQL: SELECT r1."C 1", r1.c3, CASE WHEN (r1.*)::text IS NOT NULL THEN ROW(r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8) END, r2."C 1", CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7, r2.c8) END FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1" ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST FOR UPDATE OF r1
! 

Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2017-01-13 Thread Fabien COELHO


Hello Rafia,


  sh> git apply ~/pgbench-continuation-3.patch
  # ok


It still gives me whitespace errors with git apply,


Strange.


/Users/edb/Downloads/pgbench-continuation-3.patch:31: trailing whitespace.
continuation \\{newline}



Looks like an editor issue, I used awk '{ sub("\r$", ""); print }'
patch1 > patch2 to clean it and it was applying then.


Doing that does not change the file for me.

I see no \r in the patch file according to "od", I just see "nl" (0x0a).

sha1sum: 97fe805a89707565210699694467f9256ca02dab pgbench-continuation-3.patch

Could it be related to transformations operated when the file is 
downloaded and saved, because it is a text file?


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: WG: [HACKERS] Packages: Again

2017-01-13 Thread Thomas Kellerer
Wolfgang Wilhelm wrote
> - The more difficult a database change including rewriting of code will
> get the less likely you'll find something paying for it. In my case there
> is a list of reasons from the customer _not_ to switch from Oracle to
> PostgreSQL. Besides more obvious reasons like APEX applications on the
> list there are things like "complicated PL/SQL code e.g. ... packages..."
> (whatever complicated is). Lots of the other reasons on that list begin to
> blur because of the changes of the recent versions or the near future like
> parallelisation or working on partitions.
> Of course there are some questions about style, maintainability... But
> this would be another post.

We are a similar shop: mostly Oracle and increasingly more Postgres. 

But we essentially stopped (or are in the process of) using packages
altogether - /because/ of maintainability. If a package contains more then
just a single procedure it's impossible for two devs to work on different
procedures because the package body still needs to be a *single* source file
(which sometimes means: a single file with 10 or 20 procedures). Wherever we
have the chance we started migrating packages into standalone procedures. 

Which is a bit cumbersome given Oracle's limit on 30 characters for
identifiers - but it still increases maintainability. And one of the
advantages given for packages was the increase in namespace availability
which is much easier with Postgres anyway.

Just my 0.02€






--
View this message in context: 
http://postgresql.nabble.com/Packages-Again-tp5938583p5938892.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UNDO and in-place update

2017-01-13 Thread Amit Kapila
On Thu, Jan 12, 2017 at 8:56 PM, Robert Haas  wrote:
> On Wed, Jan 11, 2017 at 5:18 AM, Amit Kapila  wrote:
>> Moving further, I have thought about consistent reads and different
>> formats for storing undo with pros and cons of each format.
>>
>> Consistent Reads
>> 
>> If the page is modified by any transaction that is not visible to read
>> transaction's snapshot, we have to reconstruct a copy of the page.
>> Now there could be multiple ways to achieve it:
>>
>> 1. Apply the undo for the transactions that are not visible to read
>> transaction and keep the copy of the page.
>> 2. Apply the undo for all the transactions in TPD entry of a page.
>>
>> I think we can't do second because this can create rows which are not
>> consistent due to apply of undo log for transactions which are visible
>> to read transaction.  Now, where the first approach will work, but it
>> might turn out to be inefficient if not implemented carefully, as for
>> each read transaction we need to create separate copies of the page
>> where a different set of transactions are not visible to different
>> read transactions. Even for read transactions to which the same set of
>> transactions are not visible, we need some way to recognize the
>> version of the page that can be used, probably try with all the
>> version of pages and see if any of the existing versions can serve the
>> purpose.  I think we also need some way to link the different versions
>> of the page in shared buffers so that before trying to create a new
>> version of the page we can look at linked versions.
>>
>> Yet another consideration in this area could be to perform consistent
>> reads differently for index scans.  For index scan, we will receive a
>> particular TID to read, so, in this case, we can try to reconstruct
>> the old image of that particular row.  Now, whereas this sounds
>> efficient, but we might need to reconstruct the old image of rows
>> multiple times for different read transactions. The effect will be
>> magnificent when we have to perform this for multiple rows of a page.
>> I think it might be better to always construct a complete copy of page
>> if one or more transactions that have changed the page are not visible
>> to read snapshot.
>
> I was thinking about this whole area somewhat differently.  We don't
> really need a complete imagine of the page.  We just need to find the
> correct version of each tuple that is visible.  So I would suggest
> that we consider avoiding page reconstruction altogether.  A
> sequential scan iterates through each TID on the page and says "is
> this tuple visible?".  Right now the answer is either "yes" or "no",
> but maybe the answer could be "yes", "no", or "look at this other
> version in the UNDO segment instead".  The advantage of this approach
> is that we don't end up copying tuples into a new "reconstructed"
> page.  That copying will take CPU time and the resulting page will
> consume cache space.
>
> Now, the problem with this is that we want to avoid walking through
> the whole UNDO chain for the page multiple times.  But maybe there are
> ways to avoid that -- for starters, if the recently-modified bit on
> the page isn't set, we don't need to walk through the UNDO chain at
> all.  For a sequential scan, we could maintain a backend-local cache
> of information that gets updated as we walk the chain.  Perhaps
> there's no way to make these ideas work and we're going to end up
> doing some form of page reconstruction, but I think it's worth some
> thought anyway.
>

Sure, we can do that way and I agree that it is worth considering.
Few cases where it can be really costly is when undo chain overflows
to multiple pages and those pages doesn't exist in cache.  I think the
reason why it is probable is that undo chain is not maintained for row
update, rather it is maintained for a page level changes for each
transaction.  So, what this means is that if we have to traverse the
chain for record X, then I think it is quite possible that during
chain traversal, we will encounter undo of record Y, ofcourse we can
identify and ignore it, but certainly the chain can be much longer.
Another kind of cost could be construction of actual record from undo
record, for example if we consider to undo log only changed parts of
Update as we do for WAL, then we have to incur some record
construction cost as well. I think what can make it really costly is
repeating this work.  Even, if we consider the above as a worse case,
I am not sure whether it will be cheap for short transactions like
pgbench as there could be cases where concurrent updates/reads needs
to traverse undo chains.

Having said that, if you don't feel strongly about caching the
information in such a way that it can be reused by different sessions,
then we can initially implement the system as outlined by you, then we
can extend it based on the performance characteristics.


>
> I 

Re: [HACKERS] Passing query string to workers

2017-01-13 Thread Rafia Sabih
On Thu, Jan 12, 2017 at 6:24 PM, Robert Haas  wrote:
> On Wed, Jan 11, 2017 at 11:12 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> On Wed, Jan 11, 2017 at 6:36 PM, Tom Lane  wrote:
 That would work, if you had a way to get at the active QueryDesc ...
 but we don't pass that down to executor nodes.
>>
>>> Hmm, that is a bit of a problem.  Do you have a suggestion?
>>
>> Copy that string pointer into the EState, perhaps?
>
> OK, that sounds good.

Thanks a lot Tom and Robert for the suggestions. Please find the
attached file for the revised version of the patch, wherein I added an
extra pointer for queryString in EState and populated it with
queryDesc->sourceText in standard_ExecutorStart. Later, in
ExecInitParallelPlan a token for queryString is created in shared
memory which is used in ExecParallelGetQueryDesc and
ParallelWorkerMain as before (in version 1 of patch).

Please let me know your feedback over the same.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


pass_queryText_to_workers_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Replication slot xmin is not reset if HS feedback is turned off while standby is shut down

2017-01-13 Thread Ants Aasma
On 5 Jan 2017 2:54 a.m., "Craig Ringer"  wrote:
> Ants, do you think you'll have a chance to convert your shell script
> test into a TAP test in src/test/recovery?
>
> Simon has said he would like to commit this fix. I'd personally be
> happier if it had a test to go with it.
>
> You could probably just add to src/test/recover/t/001 which now
> contains my additions for hot standby.

Do you feel the test in the attached patch is enough or would you like
to see anything else covered?

Regards,
Ants Aasma
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index c6b54ec..0ab936f 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1157,7 +1157,10 @@ XLogWalRcvSendReply(bool force, bool requestReply)
  * in case they don't have a watch.
  *
  * If the user disables feedback, send one final message to tell sender
- * to forget about the xmin on this standby.
+ * to forget about the xmin on this standby. We also send this message
+ * on first connect because a previous connection might have set xmin
+ * on a replication slot. (If we're not using a slot it's harmless to
+ * send a feedback message explicitly setting InvalidTransactionId).
  */
 static void
 XLogWalRcvSendHSFeedback(bool immed)
@@ -1167,7 +1170,8 @@ XLogWalRcvSendHSFeedback(bool immed)
 	uint32		nextEpoch;
 	TransactionId xmin;
 	static TimestampTz sendTime = 0;
-	static bool master_has_standby_xmin = false;
+	/* initially true so we always send at least one feedback message */
+	static bool master_has_standby_xmin = true;
 
 	/*
 	 * If the user doesn't want status to be reported to the master, be sure
@@ -1192,14 +1196,17 @@ XLogWalRcvSendHSFeedback(bool immed)
 	}
 
 	/*
-	 * If Hot Standby is not yet active there is nothing to send. Check this
-	 * after the interval has expired to reduce number of calls.
+	 * If Hot Standby is not yet accepting connections there is nothing to
+	 * send. Check this after the interval has expired to reduce number of
+	 * calls.
+	 *
+	 * Bailing out here also ensures that we don't send feedback until we've
+	 * read our own replication slot state, so we don't tell the master to
+	 * discard needed xmin or catalog_xmin from any slots that may exist
+	 * on this replica.
 	 */
 	if (!HotStandbyActive())
-	{
-		Assert(!master_has_standby_xmin);
 		return;
-	}
 
 	/*
 	 * Make the expensive call to get the oldest xmin once we are certain
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index eef512d..7fb2e9e 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 22;
+use Test::More tests => 24;
 
 # Initialize master node
 my $node_master = get_new_node('master');
@@ -161,3 +161,22 @@ is($catalog_xmin, '', 'non-cascaded slot xmin still null with hs_feedback reset'
 ($xmin, $catalog_xmin) = get_slot_xmins($node_standby_1, $slotname_2);
 is($xmin, '', 'cascaded slot xmin null with hs feedback reset');
 is($catalog_xmin, '', 'cascaded slot xmin still null with hs_feedback reset');
+
+diag "re-enabling hot_standby_feedback and disabling while stopped";
+$node_standby_2->safe_psql('postgres', 'ALTER SYSTEM SET hot_standby_feedback = on;');
+$node_standby_2->reload;
+
+$node_master->safe_psql('postgres', qq[INSERT INTO tab_int VALUES (11000);]);
+replay_check();
+
+$node_standby_2->safe_psql('postgres', 'ALTER SYSTEM SET hot_standby_feedback = off;');
+$node_standby_2->stop;
+
+($xmin, $catalog_xmin) = get_slot_xmins($node_standby_1, $slotname_2);
+isnt($xmin, '', 'cascaded slot xmin non-null with postgres shut down');
+
+# Xmin from a previous run should be cleared on startup.
+$node_standby_2->start;
+
+($xmin, $catalog_xmin) = get_slot_xmins($node_standby_1, $slotname_2);
+is($xmin, '', 'cascaded slot xmin reset after startup with hs feedback reset');

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Wolfgang Wilhelm
Hello again,
well, I didn't want to tell you to match in PostgreSQL the technical debt. If I 
made it look so, sorry for my bad english.
No, a "free clone of Oracle" isn't my intention. I don't want to convince 
Oracle evangelists to use PostgreSQL. This is time wasted. But I'd prefer a 
project where the community is thriving and the more members it has the better.
My top point on the list above is making the way away from Oracle smoother and 
that's ways easier when you don't have to argue against arguments like "PG 
doesn't have packages" including that what was mentioned as pro arguments for 
packages by Tom Kyte. This is the best way to stay in a discussion about that 
single topic and have _no_ chance to show them the benefits.

Regards,Wolfgang
 

Pavel Stehule  schrieb am 10:41 Freitag, 13.Januar 
2017:
 

 

2017-01-13 10:11 GMT+01:00 Craig Ringer :

On 13 January 2017 at 16:49, Wolfgang Wilhelm  wrote:
> - Devs just don't want to change (some) code. Everybody seems to have code
> with huge technical debt which is best not to be touched. This code should
> have an easy "move code from Oracle to PostgreSQL", best case by not forcing
> the devs to look at this scary code.

That's kind of offsetting their technical debt onto us, though, and by
extension other PostgreSQL users.

Support compatibility stuff we don't need, that doesn't really benefit
other users, just so they can do less porting and cleanup work.

I'm 100% for implementing _useful_ features based on users' migration
needs from other DBMSes. Even ones that aren't that useful, but come
at low cost to us. But introducing whole new systems to make porting a
little easier does not thrill me.

Also, that's where EDB's market is, and I don't personally feel any
desire to push community PostgreSQL in the Oracle compatibility
direction. Porting tools, sure. Useful features, sure. Direct
compatibility, meh.

I guess what I'm saying is that if someone wants PostgreSQL to have
packages, they need to have:

* A design that can fit in with PostgreSQL
* Solid benefits beyond "makes life easier for Oracle users" to
justify each feature/change
* Funding/time to make it happen

So far, I haven't seen anyone with one of those, let alone all three.


+1
Regards
Pavel 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/ mailpref/pgsql-hackers




   

Re: [HACKERS] pgbench - allow backslash continuations in \set expressions

2017-01-13 Thread Rafia Sabih
On Wed, Jan 11, 2017 at 5:39 PM, Fabien COELHO  wrote:
>
> Hello,
>
>>> The attached patch adds backslash-return (well newline really)
>>> continuations
>>> to all pgbench backslash-commands.
>>>
>>> The attached test uses continuations on all such commands (sleep set
>>> setshell and shell).
>>>
>>> I think that adding continuations to psql should be a distinct patch.
>
>
>> The patch does not apply on the latest head, I guess this requires
>> rebasing since yours is posted in December.
>
>
> Strange. Here is a new version and a test for all known backslash-commands
> in pgbench.
>
>   sh> git br test master
>   sh> git apply ~/pgbench-continuation-3.patch
>   # ok
>   sh> git diff
>   # ok
>   sh> cd src/bin/pgbench
>   sh> make
>   sh> ./pgbench -t 1 -f SQL/cont.sql
>   starting vacuum...end.
>   debug(script=0,command=1): int 0
>   debug(script=0,command=2): int 1
>   debug(script=0,command=4): int 2
>   3
>   debug(script=0,command=8): int 4
>   transaction type: SQL/cont.sql
>
>> Again, it is giving trailing whitespace errors (as I reported for the
>> earlier version), plus it does not apply with git apply,
>
>
> It does above with the attached version.

It still gives me whitespace errors with git apply,

/Users/edb/Downloads/pgbench-continuation-3.patch:31: trailing whitespace.
continuation \\{newline}
/Users/edb/Downloads/pgbench-continuation-3.patch:39: trailing whitespace.
{continuation} { /* ignore */ }
/Users/edb/Downloads/pgbench-continuation-3.patch:40: trailing whitespace.
/Users/edb/Downloads/pgbench-continuation-3.patch:48: trailing whitespace.
{continuation} { /* ignore */ }
/Users/edb/Downloads/pgbench-continuation-3.patch:49: trailing whitespace.

error: patch failed: doc/src/sgml/ref/pgbench.sgml:810
error: doc/src/sgml/ref/pgbench.sgml: patch does not apply
error: patch failed: src/bin/pgbench/exprscan.l:65
error: src/bin/pgbench/exprscan.l: patch does not apply

Looks like an editor issue, I used awk '{ sub("\r$", ""); print }'
patch1 > patch2 to clean it and it was applying then.

>> hopefully that would be fixed once rebased. Other than that, I observed
>> that if after backslash space is there, then the command fails.
>
>
> Yes, this is expected.
>
>> I think it should be something like if after backslash some spaces are
>> there, followed by end-of-line then it should ignore these spaces and read
>> next line, atleast with this new meaning of backslash.
>
>
> Hmmm. This is not the behavior of backslash continuation in bash or python,
> I do not think that this is desirable to have a different behavior.
>
Okay, seems sensible.
>> Otherwise, it should be mentioned in the docs that backslash should not be
>> followed by space.
>
>
> I'm not sure. Doc says that continuation is "backslash-return", it cannot be
> more explicit. If it must say what it is not, where should it stop?
>
> --
> Fabien.

Please post the clean version and I'll mark it as ready for committer then.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-01-13 Thread Pavel Stehule
2017-01-13 10:11 GMT+01:00 Craig Ringer :

> On 13 January 2017 at 16:49, Wolfgang Wilhelm 
> wrote:
> > - Devs just don't want to change (some) code. Everybody seems to have
> code
> > with huge technical debt which is best not to be touched. This code
> should
> > have an easy "move code from Oracle to PostgreSQL", best case by not
> forcing
> > the devs to look at this scary code.
>
> That's kind of offsetting their technical debt onto us, though, and by
> extension other PostgreSQL users.
>
> Support compatibility stuff we don't need, that doesn't really benefit
> other users, just so they can do less porting and cleanup work.
>
> I'm 100% for implementing _useful_ features based on users' migration
> needs from other DBMSes. Even ones that aren't that useful, but come
> at low cost to us. But introducing whole new systems to make porting a
> little easier does not thrill me.
>
> Also, that's where EDB's market is, and I don't personally feel any
> desire to push community PostgreSQL in the Oracle compatibility
> direction. Porting tools, sure. Useful features, sure. Direct
> compatibility, meh.
>
> I guess what I'm saying is that if someone wants PostgreSQL to have
> packages, they need to have:
>
> * A design that can fit in with PostgreSQL
> * Solid benefits beyond "makes life easier for Oracle users" to
> justify each feature/change
> * Funding/time to make it happen
>
> So far, I haven't seen anyone with one of those, let alone all three.
>

+1

Regards

Pavel


>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


[HACKERS] Typo in hashsearch.c

2017-01-13 Thread Mithun Cy
There is a typo in comments of function _hash_first(); Adding a fix for same.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


hashsearch_typo01.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql - additional extra checks

2017-01-13 Thread Pavel Stehule
2017-01-13 2:46 GMT+01:00 Jim Nasby :

> On 1/11/17 5:54 AM, Pavel Stehule wrote:
>
>> +too_many_rows
>> +
>> + 
>> +  When result is assigned to a variable by INTO
>> clause,
>> +  checks if query returns more than one row. In this case the
>> assignment
>> +  is not deterministic usually - and it can be signal some issues in
>> design.
>>
>
> Shouldn't this also apply to
>
> var := blah FROM some_table WHERE ...;
>

declare x int;
begin
  x := i from generate_series(1,1) g(i);
  raise notice 'x=%', x;
end;
$$;
NOTICE:  x=1
DO

postgres=# do $$
declare x int;
begin
  x := i from generate_series(1,2) g(i);
  raise notice 'x=%', x;
end;
$$;
ERROR:  query "SELECT i from generate_series(1,2) g(i)" returned more than
one row
CONTEXT:  PL/pgSQL function inline_code_block line 4 at assignment

so extra check is not required in this case


>
> ?
>
> AIUI that's one of the beefs the plpgsql2 project has.
>

uff - I hope so plpgsql2 will carry some less scary - against the clean
syntax

x := (select .. )

you save 8 chars. And now the SELECT doesn't look like SELECT - the
statement was broken. This feature is just side effect of plpgsql quick (in
old time little bit poor) design. It is not allowed in PL/SQL and it is not
allowed by SQL/PSM.


>
> FWIW, I'd also be in favor of a NOMULTI option to INTO, but I don't see
> any way to do something like that with var := blah FROM.


This is proposed as check for current living code, where you should not to
modify source code.

We can speak about introduce new keyword or new syntax - but it should be
different thread.




>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)
>


Re: [HACKERS] Packages: Again

2017-01-13 Thread Craig Ringer
On 13 January 2017 at 16:49, Wolfgang Wilhelm  wrote:
> - Devs just don't want to change (some) code. Everybody seems to have code
> with huge technical debt which is best not to be touched. This code should
> have an easy "move code from Oracle to PostgreSQL", best case by not forcing
> the devs to look at this scary code.

That's kind of offsetting their technical debt onto us, though, and by
extension other PostgreSQL users.

Support compatibility stuff we don't need, that doesn't really benefit
other users, just so they can do less porting and cleanup work.

I'm 100% for implementing _useful_ features based on users' migration
needs from other DBMSes. Even ones that aren't that useful, but come
at low cost to us. But introducing whole new systems to make porting a
little easier does not thrill me.

Also, that's where EDB's market is, and I don't personally feel any
desire to push community PostgreSQL in the Oracle compatibility
direction. Porting tools, sure. Useful features, sure. Direct
compatibility, meh.

I guess what I'm saying is that if someone wants PostgreSQL to have
packages, they need to have:

* A design that can fit in with PostgreSQL
* Solid benefits beyond "makes life easier for Oracle users" to
justify each feature/change
* Funding/time to make it happen

So far, I haven't seen anyone with one of those, let alone all three.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


WG: [HACKERS] Packages: Again

2017-01-13 Thread Wolfgang Wilhelm




 Hello,
just my 2c on the topic.I work for an IT service provider as developer. Our 
customers are big international companies and they use Oracle a lot. My main 
work is with Oracle enterprise edition. I don't have to care about limitations 
of the smaller versions and don't have experience with users, that is 
enterprises, of that software.

- I wouldn't subscribe to Pavels opinion on the fees. Most Oracle Devs in mid- 
oder large size enterprises don't care about the costs of the database.  They 
even don't know because the purchase department will do the job and they don't 
bandy out how much discount they got from the ridiculous price list. 

- Devs just don't want to change (some) code. Everybody seems to have code with 
huge technical debt which is best not to be touched. This code should have an 
easy "move code from Oracle to PostgreSQL", best case by not forcing the devs 
to look at this scary code.
- The more difficult a database change including rewriting of code will get the 
less likely you'll find something paying for it. In my case there is a list of 
reasons from the customer _not_ to switch from Oracle to PostgreSQL. Besides 
more obvious reasons like APEX applications on the list there are things like 
"complicated PL/SQL code e.g. ... packages..." (whatever complicated is). Lots 
of the other reasons on that list begin to blur because of the changes of the 
recent versions or the near future like parallelisation or working on 
partitions.
Of course there are some questions about style, maintainability... But this 
would be another post.

RegardsWolfgang


   

   

Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-13 Thread Gilles Darold
Le 13/01/2017 à 05:26, Michael Paquier a écrit :
> OK. I have been looking at this patch.
>
> git diff --check is very unhappy, particularly because of
> update_metainfo_datafile().
Sorry, fixed.

> +
> +When logs are written to the file-system their paths, names, and
> +types are recorded in
> +the  file.  This provides
> +a convenient way to find and access log content without establishing a
> +database connection.
> +
> File system is used as a name here. In short, "file-system" -> "file
> system". I think that it would be a good idea to show up the output of
> this file. This could be reworded a bit:
> "When logs are written to the file system, the  linkend="storage-pgdata-current-logfiles"> file includes the location,
> the name and the type of the logs currently in use. This provides a
> convenient way to find the logs currently in used by the instance."
Changes applied. Output example added:

 
$ cat $PGDATA/current_logfiles
stderr pg_log/postgresql-2017-01-13_085812.log



> @@ -15441,6 +15441,19 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY
> AS t(ls,n);
>
>
>
> +   
> pg_current_logfile()
> +   text
> +   primary log file name in use by the logging collector
> +  
> +
> +  
> +   
> pg_current_logfile(text)
> +   text
> +   log file name, of log in the requested format, in use by the
> +   logging collector
> +  
> You could just use one line, using squared brackets for the additional
> argument. The description looks imprecise, perhaps something like that
> would be better: "Log file name currently in use by the logging
> collector".
OK, back to a single row entry with optional parameter.

> +/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
> +/*
> + * Name of file holding the paths, names, and types of the files where 
> current
> + * log messages are written, when the log collector is enabled.  Useful
> + * outside of Postgres when finding the name of the current log file in the
> + * case of time-based log rotation.
> + */
> Not mentioning the file names here would be better. If this spreads in
> the future updates would likely be forgotten. This paragraph could be
> reworked: "configuration file saving meta-data information about the
> log files currently in use by the system logging process".
Removed.

> --- a/src/include/miscadmin.h
> +++ b/src/include/miscadmin.h
> @@ -466,5 +466,4 @@ extern bool has_rolreplication(Oid roleid);
>  /* in access/transam/xlog.c */
>  extern bool BackupInProgress(void);
>  extern void CancelBackup(void);
> -
>  #endif   /* MISCADMIN_H */
> Unrelated diff.
Removed

> +   /*
> +* Force rewriting last log filename when reloading configuration,
> +* even if rotation_requested is false, log_destination may have
> +* been changed and we don't want to wait the next file rotation.
> +*/
> +   update_metainfo_datafile();
> +
> }
> I know I am famous for being noisy on such things, but please be
> careful with newline use..
That's ok for me, unnecessary newline removed.

> +   else
> +   {
> +   /* Unknown format, no space. Return NULL to caller. */
> +   lbuffer[0] = '\0';
> +   break;
> +   }
> Hm. I would think that an ERROR is more useful to the user here.
The problem addressed here might never happen unless file corruption but
if you know an error code that can be use in this case I will add the
error message. I can not find any error code usable in this case.

> Perhaps pg_current_logfile() should be marked as STRICT?
I don't think so, the log format parameter is optional, and passing NULL
might be like passing no parameter.

> Would it make sense to have the 0-argument version be a SRF returning
> rows of '(log_destination,location)'? We could just live with this
> function I think, and let the caller filter by logging method.
No, this case have already been eliminated during the previous review.

> +is NULL.  When multiple logfiles exist, each in a
> +different format, pg_current_logfile called
> s/logfiles/log files/.
Applied.

> Surely the temporary file of current_logfiles should not be included
> in base backups (look at basebackup.c). Documentation needs to reflect
> that as well. Also, should we have current_logfiles in a base backup?
> I would think no.
Done but can't find any documentation about the file exclusion, do you
have a pointer?

> pg_current_logfile() can be run by *any* user. We had better revoke
> its access in system_views.sql!
Why? There is no special information returned by this function unless
the path but it can be retrieve using show log_directory.

Attached is patch v20.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 30dd54c..393195f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4211,6 

Re: [HACKERS] Parallel Index-only scan

2017-01-13 Thread Rafia Sabih
On Thu, Jan 12, 2017 at 5:39 PM, Rahila Syed  wrote:
> Hello,
>
> On applying the patch on latest master branch and running regression tests
> following failure occurs.
> I applied it on latest parallel index scan patches as given in the link
> above.
>
> ***
> /home/rahila/postgres/postgres/src/test/regress/expected/select_parallel.out
> 2017-01-03 14:06:29.122022780 +0530
> ---
> /home/rahila/postgres/postgres/src/test/regress/results/select_parallel.out
> 2017-01-12 14:35:56.652712622 +0530
> ***
> *** 92,103 
>   explain (costs off)
> select  sum(parallel_restricted(unique1)) from tenk1
> group by(parallel_restricted(unique1));
> !  QUERY PLAN
> ! 
>HashAggregate
>  Group Key: parallel_restricted(unique1)
> !->  Index Only Scan using tenk1_unique1 on tenk1
> ! (3 rows)
>
>   set force_parallel_mode=1;
>   explain (costs off)
> --- 92,105 
>   explain (costs off)
> select  sum(parallel_restricted(unique1)) from tenk1
> group by(parallel_restricted(unique1));
> ! QUERY PLAN
> ! ---
>HashAggregate
>  Group Key: parallel_restricted(unique1)
> !->  Gather
> !  Workers Planned: 4
> !  ->  Parallel Index Only Scan using tenk1_unique1 on tenk1
> ! (5 rows)
>
> IIUC, parallel operation being performed here is fine as parallel restricted
> function occurs above Gather node
> and just the expected output needs to be changed.
>

True, fixed it, please find the attached file for the latest version.

> Thank you,
> Rahila Syed
>
Thanks a lot for the review Rahila.
-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


parallel_index_only_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Kyotaro HORIGUCHI
Hello,

It is big, but I think quite reasonable. The refactoring makes
many things clearer and I like it. No objection for the
patch. The affect of changing some API's are not a problem.

At Thu, 12 Jan 2017 19:00:47 +0100 (CET), Fabien COELHO  
wrote in 
> Patch applies (with "patch", "git apply" did not like it though),
> compiles, overall make check ok, pgss make check ok as well. I do not
> know whether the coverage of pg tests is enough to know whether
> anything is broken...

The same for me. There's some parts that I haven't fully
understand, though..

> I noticed that you also improved the pgss fix and tests. Good. I was
> unsure about removing spaces at the end of the query because of
> possible encoding issues.
> 
> The update end trick is nice to deal with empty statements. I wonder
> whether the sub "query" in Copy, View, CreateAs, Explain, Prepare,
> Execute, Declare... could be typed RawStmt* instead of Node*. That
> would avoid some casts in updateRawStmtEnd, but maybe add some other
> elsewhere.
> 
> One point bothered me a bit when looking at the initial code, that
> your refactoring has not changed: the location is about a string which
> is not accessible from the node, so that the string must be kept
> elsewhere and passed as a second argument here and there. ISTM that it
> would make some sense to have the initial string directly available
> from RawStmt, Query and PlannedStmt.

I found an inconsistency (in style, not function:) between
copyfuncs and equalfuncs. The patch handles the three members
utilityStmt, stmt_location and stmt_len of Query at once and
copyfuncs seems to be edited to follow the rule, but in
equalfuncs the position of utilityStmt is not moved.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-13 Thread Kyotaro HORIGUCHI
Hello,

At Thu, 12 Jan 2017 20:08:54 +0100 (CET), Fabien COELHO  
wrote in 
> 
> About having a pointer to the initial string from RawStmt, Query &
> PlannedStmt:
> 
> > I remembered one reason why we haven't done this: it's unclear how
> > we'd handle copying if we do it. If, say, Query contains a "char *"
> > pointer then you'd expect copyObject() to pstrdup that string, [...,
> > So] We'd need to work out a way of managing multiple Queries carrying
> > references to the same source string, and it's not clear how to do
> > that reasonably.
> 
> For me it would be shared, but then it may break some memory
> management hypothesis downstream.

+1 to they have a pointer to the shared query string. But doing
that without some measure like reference counting seems
difficult..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Protect syscache from bloating with negative cache entries

2017-01-13 Thread Michael Paquier
On Wed, Dec 21, 2016 at 5:10 AM, Tom Lane  wrote:
> If I thought that "every ten minutes" was an ideal way to manage this,
> I might worry about that, but it doesn't really sound promising at all.
> Every so many queries would likely work better, or better yet make it
> self-adaptive depending on how much is in the local syscache.
>
> The bigger picture here though is that we used to have limits on syscache
> size, and we got rid of them (commit 8b9bc234a, see also
> https://www.postgresql.org/message-id/flat/5141.1150327541%40sss.pgh.pa.us)
> not only because of the problem you mentioned about performance falling
> off a cliff once the working-set size exceeded the arbitrary limit, but
> also because enforcing the limit added significant overhead --- and did so
> whether or not you got any benefit from it, ie even if the limit is never
> reached.  Maybe the present patch avoids imposing a pile of overhead in
> situations where no pruning is needed, but it doesn't really look very
> promising from that angle in a quick once-over.

Have there been ever discussions about having catcache entries in a
shared memory area? This does not sound much performance-wise, I am
just wondering about the concept and I cannot find references to such
discussions.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers