Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()

2016-07-29 Thread Pavel Stehule
2016-07-30 3:47 GMT+02:00 David G. Johnston :

> On Fri, Jul 29, 2016 at 8:18 PM, Bruce Momjian  wrote:
>
>> On Tue, Jul 12, 2016 at 01:36:38PM +, thomas.ber...@1und1.de wrote:
>> > The following bug has been logged on the website:
>> >
>> > Bug reference:  14244
>> > Logged by:  Thomas Berger
>> > Email address:  thomas.ber...@1und1.de
>> > PostgreSQL version: 9.5.3
>> > Operating system:   any
>> > Description:
>> >
>> > pg_size_pretty uses the suffix "kB" (kilobyte, 10^3 byte), but the
>> returned
>> > value is "KB", or "KiB" (kibibyte, 2^10 byte). This is misleading and
>> > should be fixed. See also https://en.wikipedia.org/wiki/Kibibyte
>> >
>> > =# select pg_size_pretty(1024000::bigint);
>> >  pg_size_pretty
>> > 
>> >  1000 kB
>>
>> (Thread moved to hackers.)
>>
>> Yes, we have redefined kB, and documented its use in postgresql.conf and
>> pg_size_pretty(), but it does not match any recognized standard.
>>
>
> After bouncing on this for a bit I'm inclined to mark the bug itself
> "won't fix" but introduce a "to_binary_iso" function (I'm hopeful a better
> name will emerge...) that will output a number using ISO binary suffixes.
> I would document this under 9.8 "data type formatting functions" instead of
> within system functions.
>
> pg_size_pretty output can continue in its defined role of producing values
> usable as input to GUC variables, and to keep backward compatibility.  Add
> a note near its definition to use "to_binary_iso" for a standard-conforming
> output string.
>

We talked about this issue when I wrote the pg_size_bytes function. It is
hard to change these functions after years of usage. A new set of functions
could be better:

pg_iso_size_pretty();
pg_iso_size_bytes();

or shorter name

pg_isize_pretty();
pg_isize_bytes();

Regards

Pavel


>
> David J.
>


Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)

2016-07-29 Thread Pavel Stehule
2016-07-30 1:46 GMT+02:00 Tomas Vondra :

> Hi,
>
> On 07/29/2016 01:15 PM, Aleksander Alekseev wrote:
>
>> Hello
>>
>> Some time ago we discussed an idea of "fast temporary tables":
>>
>> https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu
>>
>> In two words the idea is following.
>>
>> 
>>
>> PostgreSQL stores information about all relations in pg_catalog. Some
>> applications create and delete a lot of temporary tables. This causes
>> bloating of pg_catalog and frequent autovacuum runs on it, which are
>> quite expensive and affect the performance of the entire database.
>>
>> We could introduce a new type of temporary tables. Information about
>> these tables is stored not in the catalog but in the backend's memory.
>> This way users can avoid the pg_catalog bloating problem and improve
>> overall database performance.
>>
>> 
>>
>
> Great! Thanks for the patch, this is definitely an annoying issue worth
> fixing. I've spent a bit of time looking at the patch today, comments below
> ...
>

Yes, this is something we have needed for a long time.

>
>
>> It took me a few months but eventually I made it work. The attached patch
>> has some flaws. I decided not to invest a lot of time in documenting
>> it or pgindent'ing all files yet. In my experience it will be rewritten
>> entirely 3 or 4 times before merging anyway :) But it _works_ and
>> passes all tests I could think of, including non-trivial cases like
>> index-only or bitmap scans of catalog tables.
>>
>>
> Well, jokes aside, that's a pretty lousy excuse for not writing any docs,
> and you're pretty much asking the reviewers to reverse-engineer your
> reasoning. So I doubt you'll get many serious reviews without fixing this
> gap.
>
> Usage example:
>>
>> ```
>> CREATE FAST TEMP TABLE fasttab_test1(x int, s text);
>>
>> INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
>>
>> UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2;
>>
>> DELETE FROM fasttab_test1 WHERE x = 3;
>>
>> SELECT * FROM fasttab_test1 ORDER BY x;
>>
>> DROP TABLE fasttab_test1;
>> ```
>>
>> More sophisticated examples can be found in the regression tests:
>>
>> ./src/test/regress/sql/fast_temp.sql
>>
>> Any feedback on this patch will be much appreciated!
>>
>>
>>
> 1) I wonder whether the FAST makes sense - does this really change the
> performance significantly? IMHO you only move the catalog rows to memory,
> so why should the tables be any faster? I also believe this conflicts with
> the SQL standard's specification of CREATE TABLE.
>

There is probably zero value in having both slow and fast temp tables (from
a catalogue-cost perspective), so the FAST implementation should be used
everywhere. But some patterns used when working with temp tables would stop
working, and we would have to decide whether to provide a workaround or not.

-- problematic pattern (old code)
IF NOT EXISTS (SELECT * FROM pg_class WHERE ...) THEN
  CREATE TEMP TABLE xxx(...);
ELSE
  TRUNCATE TABLE xxx;
END IF;

-- modern pattern (new code)
BEGIN
  TRUNCATE TABLE xxx;
EXCEPTION WHEN ... THEN
  CREATE TEMP TABLE xxx(...);
END;

In this case we could use a GUC, because the visible behaviour should be
the same.

The benefit of zero-catalogue-cost temp tables is significant - for some
larger applications temp tables caused hard performance issues.


>
> 2) Why do we need the new relpersistence value? ISTM we could easily go
> with just RELPERSISTENCE_TEMP, which would get rid of many changes right
> away, as the steps are exactly the same.
>
> IMHO if this patch gets in, we should use it as the only temp table
> implementation (Or can you think of cases where keeping rows in pg_class
> has advantages?). That'd also eliminate the need for FAST keyword in the
> CREATE TABLE command.
>
> The one thing I'm not sure about is that our handling of temporary tables
> is not standard-compliant - we require each session to create its own
> private temporary table. Moving the rows from pg_class into backend memory
> seems to go in the opposite direction, but as no one was planning to fix
> this, I don't think it matters much.
>
> 3) I think the heapam/indexam/xact and various other places need a major
> rework. You've mostly randomly sprinkled the code with function calls to
> make the patch work - that's fine for an initial version, but a more
> principled approach is needed.
>
> 4) I'm getting failures in the regression suite - apparently the patch
> somehow affects costing of index-only scans, so that several queries
> switch from index only scans to bitmap index scans etc. I haven't
> investigated this more closely, but it seems quite consistent (and I don't
> see it without the patch). It seems the patch delays building of the
> visibility map, or something like that.
>

Some other random notes:

1. With this code it should not be hard to implement global temp tables -
a shared persistent structure with local temporary data - a significant
help for anyone who has to migrate from Oracle.

2. This should also work on slaves - it is one of the ToDo items.

3. I 

Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()

2016-07-29 Thread David G. Johnston
On Fri, Jul 29, 2016 at 8:18 PM, Bruce Momjian  wrote:

> On Tue, Jul 12, 2016 at 01:36:38PM +, thomas.ber...@1und1.de wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference:  14244
> > Logged by:  Thomas Berger
> > Email address:  thomas.ber...@1und1.de
> > PostgreSQL version: 9.5.3
> > Operating system:   any
> > Description:
> >
> > pg_size_pretty uses the suffix "kB" (kilobyte, 10^3 byte), but the
> returned
> > value is "KB", or "KiB" (kibibyte, 2^10 byte). This is misleading and
> > should be fixed. See also https://en.wikipedia.org/wiki/Kibibyte
> >
> > =# select pg_size_pretty(1024000::bigint);
> >  pg_size_pretty
> > 
> >  1000 kB
>
> (Thread moved to hackers.)
>
> Yes, we have redefined kB, and documented its use in postgresql.conf and
> pg_size_pretty(), but it does not match any recognized standard.
>

After bouncing on this for a bit I'm inclined to mark the bug itself
"won't fix" but introduce a "to_binary_iso" function (I'm hopeful a better
name will emerge...) that will output a number using ISO binary suffixes.
I would document this under 9.8 "data type formatting functions" instead of
within system functions.

pg_size_pretty output can continue in its defined role of producing values
usable as input to GUC variables, and to keep backward compatibility.  Add
a note near its definition to use "to_binary_iso" for a standard-conforming
output string.

David J.


Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()

2016-07-29 Thread Bruce Momjian
On Fri, Jul 29, 2016 at 08:18:38PM -0400, Bruce Momjian wrote:
> However, that is not the end of the story.  Things have moved forward
> since 2006 and there is now firm support for either KB or KiB to be
> 1024-based units.  This blog post explains the current state of prefix
> specification:
> 
>   
> http://pchelp.ricmedia.com/kilobytes-megabytes-gigabytes-terabytes-explained/
> 
> and here is a summary for 1000/1024-based units:
> 
>   Kilobyte (Binary, JEDEC)    KB   1024
>   Kilobyte (Decimal, Metric)  kB   1000
>   Kibibyte (Binary, IEC)      KiB  1024

Oh, also, here is a Wikipedia article that has a nice chart on the top
right:

https://en.wikipedia.org/wiki/Binary_prefix

and a post that explains some of the background:


http://superuser.com/questions/938234/size-of-files-in-windows-os-its-kb-or-kb

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()

2016-07-29 Thread Bruce Momjian
On Tue, Jul 12, 2016 at 01:36:38PM +, thomas.ber...@1und1.de wrote:
> The following bug has been logged on the website:
> 
> Bug reference:  14244
> Logged by:  Thomas Berger
> Email address:  thomas.ber...@1und1.de
> PostgreSQL version: 9.5.3
> Operating system:   any
> Description:
> 
> pg_size_pretty uses the suffix "kB" (kilobyte, 10^3 byte), but the returned
> value is "KB", or "KiB" (kibibyte, 2^10 byte). This is misleading and
> should be fixed. See also https://en.wikipedia.org/wiki/Kibibyte
> 
> =# select pg_size_pretty(1024000::bigint);   
>  pg_size_pretty 
> 
>  1000 kB

(Thread moved to hackers.)

The Postgres docs specify that kB is based on 1024 or 2^10:

https://www.postgresql.org/docs/9.6/static/functions-admin.html

Note: The units kB, MB, GB and TB used by the functions
pg_size_pretty and pg_size_bytes are defined using powers of 2 rather
than powers of 10, so 1kB is 1024 bytes, 1MB is 1024^2 = 1048576 bytes,
and so on.
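
The documented convention can be sketched with a toy parser (illustrative
Python only, not the backend's actual implementation; the real pg_size_bytes
also handles quoting, negative values, and more):

```python
# Toy pg_size_bytes-style parser under the documented convention that
# kB, MB, GB and TB all denote powers of 1024, not SI powers of 10.
UNIT_FACTORS = {"bytes": 1, "kB": 1024, "MB": 1024**2,
                "GB": 1024**3, "TB": 1024**4}

def size_bytes(text):
    """Parse strings like '1 kB' or '1.5 MB' into a byte count."""
    text = text.strip()
    for unit, factor in UNIT_FACTORS.items():
        if text.endswith(unit):
            return int(float(text[:-len(unit)]) * factor)
    return int(float(text))  # bare number means bytes

print(size_bytes("1 kB"))    # 1024, not 1000
print(size_bytes("1.5 MB"))  # 1572864
```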

These prefixes were introduced to GUC variable specification in 2006:

commit b517e653489f733893d61e7a84c118325394471c
Author: Peter Eisentraut 
Date:   Thu Jul 27 08:30:41 2006 +

Allow units to be specified with configuration settings.

and added to postgresql.conf:

# Memory units:  kB = kilobytes        Time units:  ms  = milliseconds
#                MB = megabytes                     s   = seconds
#                GB = gigabytes                     min = minutes
#                TB = terabytes                     h   = hours
#                                                   d   = days

and the units were copied when pg_size_pretty() was implemented.  These
units are based on the International System of Units (SI)/metric.
However, the SI system is power-of-10-based, and we simply re-purposed
the prefixes to be 1024-based (2^10).

However, that is not the end of the story.  Things have moved forward
since 2006 and there is now firm support for either KB or KiB to be
1024-based units.  This blog post explains the current state of prefix
specification:


http://pchelp.ricmedia.com/kilobytes-megabytes-gigabytes-terabytes-explained/

and here is a summary for 1000/1024-based units:

Kilobyte (Binary, JEDEC)    KB   1024
Kilobyte (Decimal, Metric)  kB   1000
Kibibyte (Binary, IEC)      KiB  1024

You will notice that none of these list kB as 1024, which explains this
bug report.

Yes, we have redefined kB, and documented its use in postgresql.conf and
pg_size_pretty(), but it does not match any recognized standard.

I am thinking Postgres 10 would be a good time to switch to KB as a
1024-based prefix.  Unfortunately, there is no similar fix for MB, GB,
etc.  Since 'm' means 'milli', mB was never used, so in both JEDEC and
Metric usage MB remains ambiguous between 1000-based and 1024-based.

IEC does give us a unique specification for 'mega', MiB, and GiB, which
might be what we want to use, but that might be too big a change, and I
rarely see those.
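
To make the ambiguity concrete, here is a hedged sketch (plain Python, not
pg_size_pretty itself) rendering the same byte count under the three
conventions summarized above:

```python
def format_size(n_bytes, base, suffixes):
    """Format n_bytes using the given base (1000 or 1024) and suffix list."""
    value = float(n_bytes)
    for suffix in suffixes[:-1]:
        if value < base:
            return f"{value:g} {suffix}"
        value /= base
    return f"{value:g} {suffixes[-1]}"

# PostgreSQL's current convention: metric-looking suffixes, binary base.
print(format_size(1024000, 1024, ["bytes", "kB", "MB", "GB", "TB"]))      # 1000 kB
# IEC binary suffixes make the base explicit.
print(format_size(1024000, 1024, ["bytes", "KiB", "MiB", "GiB", "TiB"]))  # 1000 KiB
# A strictly metric reading of the same suffixes gives a different answer.
print(format_size(1024000, 1000, ["bytes", "kB", "MB", "GB", "TB"]))      # 1.024 MB
```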

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)

2016-07-29 Thread Tomas Vondra

Hi,

On 07/29/2016 01:15 PM, Aleksander Alekseev wrote:

Hello

Some time ago we discussed an idea of "fast temporary tables":

https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu

In two words the idea is following.



PostgreSQL stores information about all relations in pg_catalog. Some
applications create and delete a lot of temporary tables. This causes
bloating of pg_catalog and frequent autovacuum runs on it, which are
quite expensive and affect the performance of the entire database.

We could introduce a new type of temporary tables. Information about
these tables is stored not in the catalog but in the backend's memory.
This way users can avoid the pg_catalog bloating problem and improve
overall database performance.
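
The mechanism can be caricatured in a few lines (a hypothetical sketch of
the idea only; the real patch hooks into heapam/indexam so ordinary catalog
scans transparently see the in-memory rows):

```python
# Sketch: per-backend in-memory catalog rows that shadow the shared
# catalog, so creating/dropping fast temp tables never writes pg_catalog.
class CatalogOverlay:
    def __init__(self, shared):
        self.shared = shared   # stands in for on-disk pg_catalog rows
        self.local = {}        # backend-local rows, discarded at backend exit

    def create_fast_temp(self, relname, row):
        self.local[relname] = row          # no shared-catalog write, no bloat

    def drop(self, relname):
        self.local.pop(relname, None)      # no dead tuples for autovacuum

    def lookup(self, relname):
        # Backend-local entries take precedence over the shared catalog.
        return self.local.get(relname, self.shared.get(relname))

cat = CatalogOverlay({"pg_type": {"oid": 1247}})
cat.create_fast_temp("fasttab_test1", {"oid": 90001})
print(cat.lookup("fasttab_test1"))  # {'oid': 90001}
cat.drop("fasttab_test1")
print(cat.lookup("fasttab_test1"))  # None
```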




Great! Thanks for the patch, this is definitely an annoying issue worth 
fixing. I've spent a bit of time looking at the patch today, comments 
below ...




It took me a few months but eventually I made it work. The attached patch
has some flaws. I decided not to invest a lot of time in documenting
it or pgindent'ing all files yet. In my experience it will be rewritten
entirely 3 or 4 times before merging anyway :) But it _works_ and
passes all tests I could think of, including non-trivial cases like
index-only or bitmap scans of catalog tables.



Well, jokes aside, that's a pretty lousy excuse for not writing any 
docs, and you're pretty much asking the reviewers to reverse-engineer 
your reasoning. So I doubt you'll get many serious reviews without 
fixing this gap.



Usage example:

```
CREATE FAST TEMP TABLE fasttab_test1(x int, s text);

INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');

UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2;

DELETE FROM fasttab_test1 WHERE x = 3;

SELECT * FROM fasttab_test1 ORDER BY x;

DROP TABLE fasttab_test1;
```

More sophisticated examples can be found in the regression tests:

./src/test/regress/sql/fast_temp.sql

Any feedback on this patch will be much appreciated!




1) I wonder whether the FAST makes sense - does this really change the 
performance significantly? IMHO you only move the catalog rows to 
memory, so why should the tables be any faster? I also believe this 
conflicts with the SQL standard's specification of CREATE TABLE.


2) Why do we need the new relpersistence value? ISTM we could easily go
with just RELPERSISTENCE_TEMP, which would get rid of many changes right
away, as the steps are exactly the same.


IMHO if this patch gets in, we should use it as the only temp table 
implementation (Or can you think of cases where keeping rows in pg_class 
has advantages?). That'd also eliminate the need for FAST keyword in the 
CREATE TABLE command.


The one thing I'm not sure about is that our handling of temporary tables
is not standard-compliant - we require each session to create its own
private temporary table. Moving the rows from pg_class into backend 
memory seems to go in the opposite direction, but as no one was planning 
to fix this, I don't think it matters much.


3) I think the heapam/indexam/xact and various other places need a
major rework. You've mostly randomly sprinkled the code with function
calls to make the patch work - that's fine for an initial version, but a 
more principled approach is needed.


4) I'm getting failures in the regression suite - apparently the patch 
somehow affects costing of index-only scans, so that several queries
switch from index only scans to bitmap index scans etc. I haven't 
investigated this more closely, but it seems quite consistent (and I 
don't see it without the patch). It seems the patch delays building of
the visibility map, or something like that.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-29 Thread Michael Paquier
On Sat, Jul 30, 2016 at 1:17 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier
>>  wrote:
>>> While looking at the series of functions pg_get_*, I have noticed as
>>> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does
>>> not know a user. Perhaps we'd want to change that to NULL for
>>> consistency with the rest?
>
>> That's probably correct in theory, but it's a little less closely
>> related, and I'm not entirely sure how far we want to go with this.
>> Remember, the original purpose was to avoid having an internal error
>> (cache lookup failed, XX000) exposed as a user-visible error message.
>> Are we at risk from veering from actual bug-fixing off into useless
>> tinkering?  Not sure.
>
> I'd vote for leaving that one alone; yeah, it's a bit inconsistent
> now, but no one has complained about its behavior.

OK for me. Thanks for the commit.
-- 
Michael




Re: [HACKERS] sslmode=require fallback

2016-07-29 Thread Greg Stark
On Fri, Jul 29, 2016 at 4:13 PM, Bruce Momjian  wrote:
> Yes, I am thinking of a case where Postgres is down but a malevolent
> user starts a Postgres server on 5432 to gather passwords.

Or someone spoofs your DNS lookup, which is an attack that can
actually be done remotely in some cases.

For what it's worth the SCRAM work also addresses precisely this
danger though it doesn't prevent the attacker from pretending to be a
real server and capturing private data from the SQL updates.

Even in the case where there's no known server certificate it could
save the fingerprint seen once and require it not change. This proves
to be a headache to manage though. It's equivalent to the SSH
known_hosts scheme. How many times have you seen that warning message
and just automatically removed the entry in known_hosts without
verifying...
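
A trust-on-first-use pinning scheme of that sort might look like this
(illustrative Python; the class and its storage are invented for the sketch,
and libpq has no such store today):

```python
import hashlib

# Sketch of known_hosts-style pinning: remember the first certificate
# fingerprint seen for each host and refuse any later change.
class FingerprintStore:
    def __init__(self):
        self.known = {}  # host -> SHA-256 fingerprint (hex)

    def check(self, host, cert_der):
        fp = hashlib.sha256(cert_der).hexdigest()
        pinned = self.known.get(host)
        if pinned is None:
            self.known[host] = fp   # first contact: trust and pin
            return True
        return pinned == fp         # later contacts must match exactly

store = FingerprintStore()
print(store.check("db.example.com", b"cert-A"))  # True (first use, pinned)
print(store.check("db.example.com", b"cert-A"))  # True (unchanged)
print(store.check("db.example.com", b"cert-B"))  # False (possible spoof)
```

As the message notes, the headache is the False case: users tend to clear
the pin and retry rather than verify.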

One day DNSSEC will solve all these problems though. Then you'll just
store the certificate in the DNS entry for the server and the client
will insist it match.

-- 
greg




Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3

2016-07-29 Thread Stephen Frost
Michael,

(dropping -general, not sure why that list was included...)

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Wed, Jul 27, 2016 at 8:07 AM, Stephen Frost  wrote:
> > That'd be great.  It's definitely on my list of things to look into, but
> > I'm extremely busy this week.  I hope to look into it on Friday, would
> > be great to see what you find.
> 
> Sequences that are directly defined in extensions do not get dumped,
> and sequences that are part of a serial column in an extension are
> getting dumped. Looking into this problem, getOwnedSeqs() is visibly
> making an incorrect assumption: sequences owned by table columns are
> dumped unconditionally, but that should not hold for sequences that are
> part of extensions. More precisely, dobj->dump is being forced to
> DUMP_COMPONENT_ALL, which causes the sequence definition to be dumped.
> Oops.

Right, it should be set to the same value as the table which is being
dumped rather than DUMP_COMPONENT_ALL.  Moreover, the
owning_tab->dobj.dump check should explicitly check against
DUMP_COMPONENT_NONE, but that's just to be neat since that's '0'.

> The patch attached fixes the problem for me. I have added as well
> tests in test_pg_dump in the shape of sequences defined in an
> extension, and sequences that are part of a serial column. This patch
> is also able to work in the case where a sequence is created as part
> of a serial column, and gets removed after, say with ALTER EXTENSION
> DROP SEQUENCE. The behavior for sequences and serial columns that are
> not part of extensions is unchanged.

This isn't correct as the dump components which are independent of the
extension (ACLs, security labels, policies) won't be dumped.

Consider, for example, what happens if the user runs:

GRANT USAGE ON SEQUENCE  TO role;

This wouldn't get dumped out with your patch, since we would decide that,
because the sequence is a member of the extension, nothing about it
should be dumped, which isn't correct.

> Stephen, it would be good if you could check the correctness of this
> patch as you did all this refactoring of pg_dump to support catalog
> ACLs. I am sure, by the way, that checking for (owning_tab->dobj.dump &
> DUMP_COMPONENT_DEFINITION) != 0 is not good because of, for example, the
> case of a serial column created in an extension where the sequence is
> dropped from the extension afterwards.

If the sequence is dropped from the extension then the sequence will be
ignored by checkExtensionMembership() during selectDumpableObject() and
the regular selectDumpableObject() code will handle marking the sequence
appropriately.

What we need to be doing here is combining the set of components that
the sequence has been marked with and the set of components that the
table has been marked with, not trying to figure out if the sequence is
a member of an extension or not and changing what to do in those cases,
that's checkExtensionMembership()'s job, really.
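
The fix amounts to a bitwise union of component sets, which can be sketched
like this (the flag names mirror pg_dump's, but the numeric values here are
invented for illustration):

```python
# pg_dump tracks, per object, a bitmask of components to dump.
DUMP_COMPONENT_NONE       = 0
DUMP_COMPONENT_DEFINITION = 1 << 0
DUMP_COMPONENT_ACL        = 1 << 1
DUMP_COMPONENT_POLICY     = 1 << 2
DUMP_COMPONENT_ALL        = (1 << 3) - 1

def mark_owned_sequence(seq_dump, owning_tab_dump):
    # Buggy behaviour: force DUMP_COMPONENT_ALL whenever the table is dumped.
    # Fixed behaviour: union the table's components with the sequence's own.
    return seq_dump | owning_tab_dump

# Extension sequence whose only non-extension component is a user GRANT:
print(mark_owned_sequence(DUMP_COMPONENT_ACL, DUMP_COMPONENT_NONE)
      == DUMP_COMPONENT_ACL)   # True: only the ACL is dumped
# A regular (non-extension) serial sequence follows its table entirely:
print(mark_owned_sequence(DUMP_COMPONENT_NONE, DUMP_COMPONENT_ALL)
      == DUMP_COMPONENT_ALL)   # True
```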

Attached is a patch implementing this and which passes the regression
tests you added and a couple that I added for the non-extension case.
I'm going to look at adding a few more regression tests and if I don't
come across any issues then I'll likely push the fix sometime tomorrow.

Comments welcome, of course.

Thanks!

Stephen
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
new file mode 100644
index 08c2b0c..333e628
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*** getOwnedSeqs(Archive *fout, TableInfo tb
*** 6035,6048 
  
  		if (!OidIsValid(seqinfo->owning_tab))
  			continue;			/* not an owned sequence */
! 		if (seqinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
! 			continue;			/* no need to search */
  		owning_tab = findTableByOid(seqinfo->owning_tab);
! 		if (owning_tab && owning_tab->dobj.dump)
! 		{
  			seqinfo->interesting = true;
- 			seqinfo->dobj.dump = DUMP_COMPONENT_ALL;
- 		}
  	}
  }
  
--- 6035,6061 
  
  		if (!OidIsValid(seqinfo->owning_tab))
  			continue;			/* not an owned sequence */
! 
  		owning_tab = findTableByOid(seqinfo->owning_tab);
! 
! 		/*
! 		 * We need to dump the components that are being dumped for the table
! 		 * and any components which the sequence is explicitly marked with.
! 		 *
! 		 * We can't simply use the set of components which are being dumped for
! 		 * the table as the table might be in an extension (and only the
! 		 * non-extension components, eg: ACLs if changed, security labels, and
! 		 * policies, are being dumped) while the sequence is not (and therefore
! 		 * the definition and other components should also be dumped).
! 		 *
! 		 * If the sequence is part of the extension then it should be properly
! 		 * marked by checkExtensionMembership() and this will be a no-op as the
! 		 * table will be equivalently marked.
! 		 */
! 		seqinfo->dobj.dump = seqinfo->dobj.dump | owning_tab->dobj.dump;
! 
! 		if (seqinfo->dobj.dump != DUMP_COMPONENT_NONE)
  			

Re: [HACKERS] to_date_valid()

2016-07-29 Thread Jim Nasby

On 7/29/16 1:33 PM, Andreas 'ads' Scherbaum wrote:

On 27.07.2016 05:00, Joshua D. Drake wrote:

On 07/26/2016 06:25 PM, Peter Eisentraut wrote:

On 7/5/16 4:24 AM, Albe Laurenz wrote:

But notwithstanding your feeling that you would like your application
to break if it makes use of this behaviour, it is a change that might
make some people pretty unhappy - nobody can tell how many.


What is the use of the existing behavior?  You get back an arbitrary
implementation dependent value.  We don't even guarantee what the value
will be.  If we changed it to return a different implementation
dependent value, would users get upset?


No they would not get upset because they wouldn't know.

Can we just do the right thing?


I'm in favour of fixing this and updating the documentation.


+1. I'd say that if users complain we can always create an extension (on 
PGXN) that offers the old behavior. Users could even put that function 
before pg_catalog in search_path and get the old behavior back.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] Design for In-Core Logical Replication

2016-07-29 Thread Alvaro Herrera
Robert Haas wrote:

> One minor comment is that this document makes extensive use of Terms
> With Initial Capitals, which seems stylistically odd, although I'm not
> precisely sure what would be better.

We use <firstterm> markup on the first use of "publication" only, which
is turned into italics.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated

2016-07-29 Thread Peter Eisentraut
On 7/29/16 3:13 PM, David Fetter wrote:
> I expect this kind of blather from MySQL, but you've brought up
> something that's been bothering me for awhile.  PostgreSQL's response
> should look more like this:
> 
> ERROR:  month field value out of range: "2016-99-99"
> LINE 1: select cast('2016-99-99' as date);
>   ^
> Any idea how much effort that would be?

This particular case is probably not hard, but the problem is that that
would raise the bar about error pointer precision, and you then should
also update a bunch of other places to give similar precision.  That
could be a lot of work.

I am, however, of the opinion that these kinds of things can never be
helpful enough.  The latest trend is start and end pointers, which would
be nice.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Performance of tqueue.c's tuple remapping logic

2016-07-29 Thread Tom Lane
$SUBJECT sucks.

Create a table containing lots of composite arrays:

regression=# create table foo (f1 int8_tbl[]);
CREATE TABLE
regression=# insert into foo select array[row(1,2),row(3,4)]::int8_tbl[] from 
generate_series (1,1000);
INSERT 0 1000
regression=# vacuum analyze foo;
VACUUM

Establish a baseline for how long it takes to scan this table:

regression=# explain analyze select f1 from foo;
 QUERY PLAN 
 
-
 Seq Scan on foo  (cost=0.00..263935.06 rows=1006 width=101) (actual 
time=0.027..1461.236 rows=1000 loops=1)
 Planning time: 0.149 ms
 Execution time: 1996.995 ms
(3 rows)

... or select a non-composite value out of it:

regression=# explain analyze select f1[1].q1 from foo;
QUERY PLAN  
   
---
 Seq Scan on foo  (cost=0.00..263935.06 rows=1006 width=8) (actual 
time=1.122..3736.121 rows=1000 loops=1)
 Planning time: 0.077 ms
 Execution time: 4285.872 ms
(3 rows)

Now let's try those same queries in parallel mode:

regression=# set force_parallel_mode to 1;
SET
regression=# explain analyze select f1[1].q1 from foo;
   QUERY PLAN   
 
-
 Gather  (cost=1000.00..1264935.66 rows=1006 width=8) (actual 
time=11.402..12753.782 rows=1000 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Seq Scan on foo  (cost=0.00..263935.06 rows=1006 width=8) (actual 
time=0.182..4523.724 rows=1000 loops=1)
 Planning time: 0.081 ms
 Execution time: 13736.321 ms
(7 rows)

regression=# explain analyze select f1 from foo;
QUERY PLAN  
   
---
 Gather  (cost=1000.00..1264935.66 rows=1006 width=101) (actual 
time=6.659..22693.798 rows=1000 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Seq Scan on foo  (cost=0.00..263935.06 rows=1006 width=101) (actual 
time=0.780..2579.707 rows=1000 loops=1)
 Planning time: 0.073 ms
 Execution time: 25925.709 ms
(7 rows)

So, having to do record-type remapping in the fairly trivial case of
two-element composite arrays nearly doubles the already rather excessive
runtime for a parallel query returning lots of data.
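
The remapping work is recursive over every datum, which is roughly why it
is so expensive; a hypothetical sketch of the traversal (not the actual
tqueue.c logic, which operates on on-disk tuple representations):

```python
# Sketch: rewrite worker-local record typmods to leader typmods in every
# composite datum, recursing through arrays and nested composites.
def remap_datum(value, typmod_map):
    if isinstance(value, list):        # array: visit every element
        return [remap_datum(v, typmod_map) for v in value]
    if isinstance(value, dict):        # composite: remap typmod, recurse
        return {"typmod": typmod_map[value["typmod"]],
                "fields": [remap_datum(f, typmod_map) for f in value["fields"]]}
    return value                       # plain scalar: untouched

# One f1 value from the example table: a two-element int8_tbl[] array.
row = [{"typmod": 7, "fields": [1, 2]}, {"typmod": 7, "fields": [3, 4]}]
print(remap_datum(row, {7: 3}))
# [{'typmod': 3, 'fields': [1, 2]}, {'typmod': 3, 'fields': [3, 4]}]
```

Every tuple crossing the queue pays this walk, per element, per field.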

Just to add insult to injury, the backend's memory consumption bloats
to something over 5.5G during that last query.  Which is not terribly
surprising given all the cavalier use of TopMemoryContext in tqueue.c.

Since the entire point of parallel query is to deal with large data
volumes, I think this means that tqueue.c is simply unfit for production
use as it stands.

I propose to undertake a thorough code review, and possibly significant
rewrite, of tqueue.c over the next few days.

regards, tom lane




Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated

2016-07-29 Thread David Fetter
On Fri, Jul 29, 2016 at 09:20:12PM +0300, Nikolay Samokhvalov wrote:
> On Fri, Jul 29, 2016 at 5:28 PM, Jim Nasby  wrote:
> >
> > The way I sum up MySQL vs PG for people that ask is to recount how they
> > "fixed" the Feb. 31st bug when they released strict mode (something that
> > they actually called out in the release PR). With strict mode enabled, Feb.
> > 30th and 31st would give you an error. Feb 35th was still silently
> > converted to March whatever. *That was the MySQL mentality: data quality
> > doesn't matter compared to "ease of use".*
> >
> > They've done this throughout their history... when presented with a hard
> > problem, they skip around it or plaster over it, and then they promote that
> > their solution is the only right way to solve the problem. (Their docs
> > actually used to say that anything other that table-level locking was a bad
> > idea.)
> 
> 
> This is exactly what I mean when I say MySQL speaks a different language
> than the one I know, and that's why I simply cannot use it:
> 
> (mysql 5.7.12)
> 
> mysql> select cast('2016-99-99' as date);
> ++
> | cast('2016-99-99' as date) |
> ++
> | NULL   |
> ++
> 1 row in set, 1 warning (0.00 sec)
> 
> 
> In Postgres:
> 
> test=#  select cast('2016-99-99' as date);
> ERROR:  date/time field value out of range: "2016-99-99"
> LINE 1: select cast('2016-99-99' as date);
> ^

I expect this kind of blather from MySQL, but you've brought up
something that's been bothering me for a while.  PostgreSQL's response
should look more like this:

ERROR:  month field value out of range: "2016-99-99"
LINE 1: select cast('2016-99-99' as date);
  ^
Any idea how much effort that would be?
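To make the request concrete, here is a toy Python sketch (not the backend's C, and the function name is illustrative) of the field-level validation being asked for — report *which* date field is out of range rather than a generic message:

```python
import re

def check_date_fields(s):
    """Return None if s is a plausible YYYY-MM-DD date, otherwise an
    error string naming the offending field, in the spirit of the
    proposed PostgreSQL message."""
    m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", s)
    if not m:
        return 'invalid input syntax for type date: "%s"' % s
    year, month, day = (int(g) for g in m.groups())
    if not 1 <= month <= 12:
        return 'month field value out of range: "%s"' % s
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    days_in_month = [31, 29 if leap else 28, 31, 30, 31, 30,
                     31, 31, 30, 31, 30, 31]
    if not 1 <= day <= days_in_month[month - 1]:
        return 'day field value out of range: "%s"' % s
    return None
```

The real work would of course live in the datetime input parsing code, but the per-field branching is the whole of the idea.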

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Why we lost Uber as a user

2016-07-29 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, Jul 29, 2016 at 10:44:29AM -0400, Stephen Frost wrote:
> > Another thought that was kicking around in my head related to that is if
> > we could have indexes that only provide page-level information (similar
> > to BRIN, but maybe a btree) and which also would allow HOT updates.
> > Those indexes would typically be used in a bitmap index scan where we're
> > going to be doing a bitmap heap scan with a recheck, of course, though I
> > wonder if we could come up with a way to do an in-order bitmap index
> > scan where we sort the tuples on the page and then perform some kind of
> > mergejoin recheck (or just pull out whatever the lowest-not-seen each
> > time we sort the tuples on the page).
> 
> So allow HOT updates if the updated row is on the same page, even if the
> indexed column was changed, by scanning the page --- got it.

The idea I had was to allow creation of indexes which *only* include the
page ID.  Your rephrase seems to imply that we'd have a regular index
but then be able to figure out if a given tuple had any HOT updates
performed on it and, if so, scan the entire page.  As I understand it,
it's more complicated than that because we must involve an index when
updating a tuple in some cases (UNIQUE?) and therefore we don't perform
HOT in the case where any indexed column is being changed.

Of course, this only works if these page-level indexes don't support the
features that prevent HOT updates today.  If we can tell which existing
indexes have been built in a such a way to prevent HOT updates and which
would work with a HOT updated tuple, then perhaps we could change the
HOT code to check that when it's considering if a tuple can be updated
using HOT or not and not have only specific indexes able to support HOT
updated tuples.

This is clearly all hand-wavy, but if the BRIN indexes could work in
this way then it seems like we should be able to generalize what it is
about BRIN that allows it and provide a way for other kinds of indexes
to support tuples being HOT updated.  It appears to be clearly useful in
some use-cases.  That's really what I was trying to get at.

Thanks!

Stephen




Re: [HACKERS] to_date_valid()

2016-07-29 Thread Andreas 'ads' Scherbaum

On 27.07.2016 05:00, Joshua D. Drake wrote:

On 07/26/2016 06:25 PM, Peter Eisentraut wrote:

On 7/5/16 4:24 AM, Albe Laurenz wrote:

But notwithstanding your feeling that you would like your application
to break if it makes use of this behaviour, it is a change that might
make some people pretty unhappy - nobody can tell how many.


What is the use of the existing behavior?  You get back an arbitrary
implementation dependent value.  We don't even guarantee what the value
will be.  If we changed it to return a different implementation
dependent value, would users get upset?


No they would not get upset because they wouldn't know.

Can we just do the right thing?


I'm in favour of fixing this, and updating the documentation. But given 
the discussions in the past, it seemed like people actually depend on 
this behaviour. Hence the additional function.


If this is fixed, it's too late for the current beta. But it's a good 
time to add a note in the release notes, and advise people that it will 
be changed in the next release.



A workaround can be to rename the current function to something like 
"to_date_legacy", or "to_date_oracle". And implement the checks in to_date.


--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project




Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated

2016-07-29 Thread Nikolay Samokhvalov
On Fri, Jul 29, 2016 at 5:28 PM, Jim Nasby  wrote:
>
> The way I sum up MySQL vs PG for people that ask is to recount how they
> "fixed" the Feb. 31st bug when they released strict mode (something that
> they actually called out in the release PR). With strict mode enabled, Feb.
> 30th and 31st would give you an error. Feb 35th was still silently
> converted to March whatever. *That was the MySQL mentality: data quality
> doesn't matter compared to "ease of use".*
>
> They've done this throughout their history... when presented with a hard
> problem, they skip around it or plaster over it, and then they promote that
> their solution is the only right way to solve the problem. (Their docs
> actually used to say that anything other than table-level locking was a bad
> idea.)


This is exactly what I mean when I say MySQL speaks a different language
than the one I know, and that's why I simply cannot use it:

(mysql 5.7.12)

mysql> select cast('2016-99-99' as date);
+----------------------------+
| cast('2016-99-99' as date) |
+----------------------------+
| NULL                       |
+----------------------------+
1 row in set, 1 warning (0.00 sec)


In Postgres:

test=#  select cast('2016-99-99' as date);
ERROR:  date/time field value out of range: "2016-99-99"
LINE 1: select cast('2016-99-99' as date);
^


Re: [HACKERS] Wrong definition of pq_putmessage_noblock since 9.5

2016-07-29 Thread Tom Lane
I wrote:
> Kyotaro HORIGUCHI  writes:
>> Any work in this area is likely 10.0 material at this point.

> I'm not really happy with that, since refactoring it again will create
> back-patch hazards.  But I see that a lot of the mess here was created
> in 9.5, which means we're probably stuck with back-patch hazards anyway.

I've pushed Kyotaro-san's original patch, which is clearly a bug fix.
I think the additional changes discussed later in this thread are
cosmetic, and probably should wait for a more general review of the
layering decisions in pqcomm.c.

regards, tom lane




Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-29 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier
>  wrote:
>> While looking at the series of functions pg_get_*, I have noticed as
>> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does
>> not know a user. Perhaps we'd want to change that to NULL for
>> consistency with the rest?

> That's probably correct in theory, but it's a little less closely
> related, and I'm not entirely sure how far we want to go with this.
> Remember, the original purpose was to avoid having an internal error
> (cache lookup failed, XX000) exposed as a user-visible error message.
> Are we at risk from veering from actual bug-fixing off into useless
> tinkering?  Not sure.

I'd vote for leaving that one alone; yeah, it's a bit inconsistent
now, but no one has complained about its behavior.

regards, tom lane




Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3

2016-07-29 Thread Robert Haas
On Wed, Jul 27, 2016 at 2:24 AM, Michael Paquier
 wrote:
> On Wed, Jul 27, 2016 at 8:07 AM, Stephen Frost  wrote:
>> That'd be great.  It's definitely on my list of things to look into, but
>> I'm extremely busy this week.  I hope to look into it on Friday, would
>> be great to see what you find.
>
> Sequences that are directly defined in extensions do not get dumped,
> and sequences that are part of a serial column in an extension are
> getting dumped. Looking into this problem, getOwnedSeqs() is visibly
> doing an incorrect assumption: sequences owned by table columns are
> dumped unconditionally, but this is not true for sequences that are
> part of extensions. More precisely, dobj->dump is being enforced to
> DUMP_COMPONENT_ALL, which makes the sequence definition to be dumped.
> Oops.
>
> The patch attached fixes the problem for me. I have added as well
> tests in test_pg_dump in the shape of sequences defined in an
> extension, and sequences that are part of a serial column. This patch
> is also able to work in the case where a sequence is created as part
> of a serial column, and gets removed after, say with ALTER EXTENSION
> DROP SEQUENCE. The behavior for sequences and serial columns that are
> not part of extensions is unchanged.
>
> Stephen, it would be good if you could check the correctness of this
> patch as you did all this refactoring of pg_dump to support catalog
> ACLs. I am sure by the way that checking for (owning_tab->dobj.dump &&
> DUMP_COMPONENT_DEFINITION) != 0 is not good because of for example the
> case of a serial column created in an extension where the sequence is
> dropped from the extension afterwards.

Stephen, is this still on your list of things for today?  Please
provide a status update.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-07-29 Thread Robert Haas
On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier
 wrote:
> On Wed, Jul 27, 2016 at 5:11 AM, Robert Haas  wrote:
>> Committed with minor kibitzing: you don't need an "else" after a
>> statement that transfers control out of the function.
>
> Thanks. Right, I forgot that.
>
>> Shouldn't
>> pg_get_function_arguments, pg_get_function_identity_arguments,
>> pg_get_function_result, and pg_get_function_arg_default get the same
>> treatment?
>
> Changing all of them make sense. Please see attached.

Committed.

> While looking at the series of functions pg_get_*, I have noticed as
> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does
> not know a user. Perhaps we'd want to change that to NULL for
> consistency with the rest?

That's probably correct in theory, but it's a little less closely
related, and I'm not entirely sure how far we want to go with this.
Remember, the original purpose was to avoid having an internal error
(cache lookup failed, XX000) exposed as a user-visible error message.
Are we at risk from veering from actual bug-fixing off into useless
tinkering?  Not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Wrong definition of pq_putmessage_noblock since 9.5

2016-07-29 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> At Fri, 29 Jul 2016 12:47:53 +0900, Michael Paquier 
>  wrote in 
> 

Re: [HACKERS] Design for In-Core Logical Replication

2016-07-29 Thread Robert Haas
On Wed, Jul 20, 2016 at 4:08 AM, Simon Riggs  wrote:
> In this post, Petr and I present a joint view on a design for how this
> should work in-core, based upon our implementation experiences with physical
> replication, pglogical and various comments so far.
>
> Note that this has substantial user-visible differences from pglogical,
> though much of the underlying architecture is reused.
>
> I should stress that not all of the aspects are implemented yet. The post
> here today is a combination of all of our attempts to bring architecture,
> usability and security into one place, including a coherent way of
> describing the features and how they work.
>
> Your comments and questions are sought now as we begin the main development
> effort to get this into PostgreSQL 10.0

Thanks for publishing this.

One minor comment is that this document makes extensive use of Terms
With Initial Capitals, which seems stylistically odd, although I'm not
precisely sure what would be better.

I would have expected that there would be a concept of a REPLICATION
SET, defining which tables are to be replicated; here, that seems to
be the Publication.  That may fine, but I wonder if there is any value
in separating those things.  It's clear, for example, that a
replication set can be dumped: which tables are members of which
replication sets is durable metadata.  It's less clear that a
publication can be dumped; that might include things which are not
durable metadata, such as associations with slots.

It's generally not really clear to me based on reading this exactly
what information is encapsulated in a Publication or a Subscription,
which makes it hard to evaluate design decisions like this one:

>   
> The definition of a Publication object will be included within
> pg_dump by default when all of the objects in the Publication are
> requested as part of the dump specification.
>   
>   
> Subscriptions are not dumped by pg_dump by default, but can be
> requested using --subscriptions parameter.
>   

I think that to really understand exactly what you and Petr have in
mind, we'd need a description of where publication and subscription
data is stored within the server, and exactly what gets stored.
Perhaps that will come in a later email.  I'm not bashing the design,
exactly, I just can't quite see how all of the pieces fit together
yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] sslmode=require fallback

2016-07-29 Thread Peter Eisentraut
On 7/29/16 11:13 AM, Bruce Momjian wrote:
> Yes, I am thinking of a case where Postgres is down but a malevolent
> user starts a Postgres server on 5432 to gather passwords.  Verifying
> against an SSL certificate would avoid this problem, so there is some
> value in using SSL on localhost.  (There is no such security available
> for Unix-domain socket connections.)

Sure, there is the requirepeer connection option for that.
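For readers unfamiliar with it: requirepeer is a libpq connection option for Unix-domain sockets that verifies which OS user the server process runs as, e.g. (illustrative database name):

```
psql "host=/var/run/postgresql dbname=mydb requirepeer=postgres"
```

The connection is refused if the process listening on the socket does not belong to the named OS user, which addresses the password-harvesting scenario discussed above for local connections.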

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Why we lost Uber as a user

2016-07-29 Thread Bruce Momjian
On Fri, Jul 29, 2016 at 10:44:29AM -0400, Stephen Frost wrote:
> Another thought that was kicking around in my head related to that is if
> we could have indexes that only provide page-level information (similar
> to BRIN, but maybe a btree) and which also would allow HOT updates.
> Those indexes would typically be used in a bitmap index scan where we're
> going to be doing a bitmap heap scan with a recheck, of course, though I
> wonder if we could come up with a way to do an in-order bitmap index
> scan where we sort the tuples on the page and then perform some kind of
> mergejoin recheck (or just pull out whatever the lowest-not-seen each
> time we sort the tuples on the page).

So allow HOT updates if the updated row is on the same page, even if the
indexed column was changed, by scanning the page --- got it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] sslmode=require fallback

2016-07-29 Thread Bruce Momjian
On Tue, Jul 19, 2016 at 03:24:26PM -0400, Peter Eisentraut wrote:
> On 7/19/16 10:00 AM, Magnus Hagander wrote:
> > What could actually be useful there is to explicitly put hostnossl on
> > the localhost entries. With the current defaults on the clients, that
> > wouldn't break anything, and it would leave people without the
> > performance issues that you run into in the default deployments. And for
> > localhost it really does't make sense to encrypt -- for the local LAN
> > segment that can be argued, but for localhost...
> 
> But even on localhost you ideally want a way to confirm that the server
> you are connecting to is the right one, so you might want certificates.
> Plus the server might want certificates from the clients.  (See also the
> occasional discussion about supporting SSL over Unix-domain sockets.)

Yes, I am thinking of a case where Postgres is down but a malevolent
user starts a Postgres server on 5432 to gather passwords.  Verifying
against an SSL certificate would avoid this problem, so there is some
value in using SSL on localhost.  (There is no such security available
for Unix-domain socket connections.)
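For reference, the hostnossl entries Magnus describes above would look roughly like this in pg_hba.conf (addresses and auth method are illustrative):

```
# TYPE       DATABASE  USER  ADDRESS          METHOD
hostnossl    all       all   127.0.0.1/32     md5
hostnossl    all       all   ::1/128          md5
host         all       all   192.168.0.0/24   md5
```

With current client defaults this changes nothing functionally, but it avoids the SSL handshake and encryption overhead on loopback connections.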

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] Fix comment in ATExecValidateConstraint

2016-07-29 Thread Robert Haas
On Mon, Jul 25, 2016 at 4:18 AM, Amit Langote
 wrote:
> The comment seems to have been copied from ATExecAddColumn, which says:
>
>  /*
>   * If we are told not to recurse, there had better not be any
> - * child tables; else the addition would put them out of step.
>
> For ATExecValidateConstraint, it should say something like:
>
> + * child tables; else validating the constraint would put them
> + * out of step.
>
> Attached patch fixes it.

I agree that the current comment is wrong, but what does "out of step"
actually mean here, anyway?  I think this isn't very clear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated

2016-07-29 Thread Jim Nasby

On 7/29/16 8:17 AM, Kevin Grittner wrote:

On Thu, Jul 28, 2016 at 8:39 PM, Tatsuo Ishii  wrote:


> BTW, is there any opposite information, i.e. showing the
> limitation of MySQL comparing with PostgreSQL?

I'm not aware of a general list on the topic, but in reviewing
academic papers regarding transaction isolation I did find (and
confirm) that MySQL InnoDB relaxes the "strict" aspect of the
Strict 2 Phase Locking they use for implementing serializable
transactions.  "For performance reasons" they drop the locks

...

The way I sum up MySQL vs PG for people that ask is to recount how they 
"fixed" the Feb. 31st bug when they released strict mode (something that 
they actually called out in the release PR). With strict mode enabled, 
Feb. 30th and 31st would give you an error. Feb 35th was still silently 
converted to March whatever. *That was the MySQL mentality: data quality 
doesn't matter compared to "ease of use".*


They've done this throughout their history... when presented with a hard 
problem, they skip around it or plaster over it, and then they promote 
that their solution is the only right way to solve the problem. (Their 
docs actually used to say that anything other than table-level locking 
was a bad idea.)
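The non-strict rollover behavior described above can be sketched in a few lines of Python (a toy model, not MySQL's actual implementation): the "lenient" variant silently normalizes an impossible day into the next month, while the "strict" variant errors out the way Postgres does.

```python
from datetime import date, timedelta

def lenient_date(y, m, d):
    """Non-strict, MySQL-ish behavior (illustrative): an out-of-range
    day value silently rolls over into the following month."""
    return date(y, m, 1) + timedelta(days=d - 1)

def strict_date(y, m, d):
    """Strict, PostgreSQL-ish behavior: out-of-range fields are an
    error, never silently 'fixed'."""
    return date(y, m, d)  # raises ValueError for Feb 30th/31st/35th
```

Feb 35th, 2016 quietly becomes March 6th under the lenient rule — exactly the kind of silent data mangling being criticized.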

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] Why we lost Uber as a user

2016-07-29 Thread Stephen Frost
* Jim Nasby (jim.na...@bluetreble.com) wrote:
> On 7/28/16 10:05 AM, Alex Ignatov wrote:
> >>Just curious: what if PostgreSQL supported index that stores "primary
> >>key" (or unique key) instead of tids?
> >
> >You mean IOT like Oracle have?
> 
> IIRC, an IOT stores the table in index order, which is something
> different.

IOT is definitely an interesting idea that I'd like to see us pursue,
but I agree that it's something different.

> What Alex is proposing is an index method that stores a datum
> instead of a ctid. You would then use that datum to probe a
> different index to get the ctid. Or put simply, you have a PK index
> that contains ctid's, and a bunch of other indexes that contain a PK
> value instead of ctid's.

Right, that's the MySQL approach, which has advantages and
disadvantages.

> I think it's an idea worth pursuing, but I don't see how you can
> make it work with our MVCC system unless we drop the aversion to
> scanning back into an index as part of an update.

I'm not terribly excited about the MySQL approach, personally, but I
really like the idea of trying to make HOT updates smarter and allow HOT
updates for indexes which don't include TIDs, as Robert and Alvaro are
discussing.

Another thought that was kicking around in my head related to that is if
we could have indexes that only provide page-level information (similar
to BRIN, but maybe a btree) and which also would allow HOT updates.
Those indexes would typically be used in a bitmap index scan where we're
going to be doing a bitmap heap scan with a recheck, of course, though I
wonder if we could come up with a way to do an in-order bitmap index
scan where we sort the tuples on the page and then perform some kind of
mergejoin recheck (or just pull out whatever the lowest-not-seen each
time we sort the tuples on the page).

All very hand-wavy, of course, and it'd make sense to make the concept
work for BRIN before we consider anything else, but it seems like there
could be a use-case for allowing indexes other than BRIN to be built in
a way that allows HOT updates to happen, thus eliminating the cost of
having to update those indexes when the tuple is changed, in many cases.
Of course, those indexes couldn't be used as UNIQUE indexes or for
primary keys, and by adjusting the parameters of a BRIN index you could
possibly get a similar index, but this might allow such an index to
still be usable for index-only scans, which a BRIN index will never be
able to provide.
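To make the hand-waving slightly more concrete, here is a toy Python model (all names illustrative) of an index that records only which heap pages *may* contain a key, with exact matching deferred to a per-page recheck — the bitmap-heap-scan-with-recheck shape described above. Because the index never stores per-tuple pointers, an update that keeps the row on its page needs no index maintenance at all.

```python
from collections import defaultdict

class PageLevelIndex:
    """Toy page-level index: key -> set of heap pages that may match."""
    def __init__(self):
        self.pages_for_key = defaultdict(set)

    def insert(self, key, page):
        self.pages_for_key[key].add(page)

    def lookup(self, key):
        return self.pages_for_key.get(key, set())

def page_scan(heap, index, key):
    """heap maps page_id -> list of (key, payload) rows; every row on a
    candidate page is rechecked against the search key."""
    result = []
    for page in sorted(index.lookup(key)):
        for k, payload in heap[page]:  # the recheck step
            if k == key:
                result.append(payload)
    return result
```

Updating a row in place (same page, even with a changed key value within the indexed range) leaves the index valid, which is the property that would make HOT updates possible.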

Thanks!

Stephen




Re: [HACKERS] Why we lost Uber as a user

2016-07-29 Thread Jim Nasby

On 7/28/16 10:05 AM, Alex Ignatov wrote:

Just curious: what if PostgreSQL supported index that stores "primary
key" (or unique key) instead of tids?


You mean IOT like Oracle have?


IIRC, an IOT stores the table in index order, which is something 
different.


What Alex is proposing is an index method that stores a datum instead of 
a ctid. You would then use that datum to probe a different index to get 
the ctid. Or put simply, you have a PK index that contains ctid's, and a 
bunch of other indexes that contain a PK value instead of ctid's.


I think it's an idea worth pursuing, but I don't see how you can make it 
work with our MVCC system unless we drop the aversion to scanning back 
into an index as part of an update.
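The two-probe lookup being proposed can be sketched with plain dictionaries (a toy model, not a real access method): secondary indexes store PK values, only the PK index stores heap pointers, so moving a row touches one index instead of all of them.

```python
pk_index = {}    # pk -> ctid (the only index that stores heap pointers)
name_index = {}  # name -> pk (a secondary index stores the PK instead)
heap = {}        # ctid -> row

def insert_row(ctid, pk, name):
    heap[ctid] = {"pk": pk, "name": name}
    pk_index[pk] = ctid
    name_index[name] = pk

def find_by_name(name):
    pk = name_index[name]          # first probe: secondary index
    return heap[pk_index[pk]]      # second probe: PK index -> ctid

insert_row((1, 1), 100, "alice")
# The row moves to a new heap location: only the PK index is updated;
# the secondary index is untouched.
heap[(7, 3)] = heap.pop((1, 1))
pk_index[100] = (7, 3)
```

The cost, as noted, is an extra index probe on every secondary-index lookup — the trade-off InnoDB makes.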

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] pg_basebackup wish list

2016-07-29 Thread Amit Kapila
On Thu, Jul 28, 2016 at 7:34 PM, Fujii Masao  wrote:
> On Thu, Jul 28, 2016 at 10:16 PM, Amit Kapila  wrote:
>
>> I think there is some value in providing
>> .tar for -Z 0,
>
> I was thinking that "-Ft -Z0" is something like an alias of "-Ft".
> That is, the backup is taken in uncompressed tar format.
>
>> however in that case how should we define usage of -F p
>> -Z 0?  Shall we say with plain format -Z 0 gets ignored or throw error
>> or do something else?  If first, then I think it is better to mention
>> the same in docs.
>
> ISTM that it's better to ignore that case, like pg_dump -Ft -Z0
> doesn't throw an error.
>

Okay, then you can go ahead with your patch.



--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated

2016-07-29 Thread Kevin Grittner
On Thu, Jul 28, 2016 at 8:39 PM, Tatsuo Ishii  wrote:

> BTW, is there any opposite information, i.e. showing the
> limitation of MySQL comparing with PostgreSQL?

I'm not aware of a general list on the topic, but in reviewing
academic papers regarding transaction isolation I did find (and
confirm) that MySQL InnoDB relaxes the "strict" aspect of the
Strict 2 Phase Locking they use for implementing serializable
transactions.  "For performance reasons" they drop the locks
acquired during the transaction *before* ensuring crash/recovery
persistence.  This is more-or-less equivalent to always running
with synchronous_commit = off as well as allowing a small window
for serialization anomalies in corner cases.  The PostgreSQL
synchronous_commit option allows a similar performance benefit
(where the trade-off is deemed justified) without risking data
integrity in the same way.
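The ordering difference Kevin describes can be written out as a toy commit timeline (purely illustrative, not either engine's actual code): strict 2PL releases locks only after the commit record is durable, while the relaxed variant releases them first, opening a window where other transactions can observe changes that a crash could still lose.

```python
def commit_events(release_locks_early):
    """Return the ordered commit-time steps under the two policies."""
    events = ["acquire locks", "apply changes"]
    if release_locks_early:
        # Relaxed ordering: other sessions may see the changes before
        # they are durable -- the anomaly window described above.
        events += ["release locks", "flush commit record to disk"]
    else:
        # Strict 2PL: nothing becomes visible until the commit is durable.
        events += ["flush commit record to disk", "release locks"]
    return events
```

The relaxed ordering buys roughly what synchronous_commit = off buys in PostgreSQL, but without giving the user the choice.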

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Optimizing numeric SUM() aggregate

2016-07-29 Thread Andrew Borodin
I've tested patch with this query
https://gist.github.com/x4m/fee16ed1a55217528f036983d88771b4
Test specs were: Ubuntu 14 LTS VM, dynamic RAM, hypervisor Windows
Server 2016, SSD disk, core i5-2500. Configuration: out of the box
master make.

Over 10 test runs, the timing of the select statement was: AVG 3739.8 ms, STD 435.4193.
With the patch applied (as is): 3017.8 ms, STD 319.893.

Increasing the overflow constant showed no statistically significant
performance improvement (and I do not think it is worth doing).

2016-07-27 17:32 GMT+05:00 Tom Lane :
> For that matter, spelling INT_MAX as 0x7FFFFFFF is also not per project style.

Sure, 0x7FFFFFFF was not for code but for mental arithmetic. I would
even add that INT_MAX is system-dependent and varies across different
specs. I'd suggest INT32_MAX.
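For readers following along, the arithmetic behind the overflow constant works out like this (a Python sketch of the deferred-carry idea; NBASE and the function names follow numeric's base-10000 digit representation, everything else is illustrative): each addition contributes at most NBASE-1 to a 32-bit digit lane, so the expensive carry pass only needs to run once every couple hundred thousand additions.

```python
NBASE = 10000           # numeric's base-10000 digit representation
INT32_MAX = 2**31 - 1
# Each addition adds at most NBASE-1 to a digit lane, so a 32-bit lane
# can absorb this many additions before a carry pass is required:
SAFE_ADDS = INT32_MAX // (NBASE - 1)

def propagate(acc):
    """Run one full carry-propagation pass over the accumulator."""
    carry = 0
    for i in range(len(acc) - 1, -1, -1):
        v = acc[i] + carry
        acc[i] = v % NBASE
        carry = v // NBASE
    assert carry == 0, "accumulator sized with headroom"

def sum_digit_arrays(values, ndigits):
    """Sum base-NBASE digit arrays, deferring carry propagation."""
    acc = [0] * ndigits
    pending = 0
    for digits in values:
        for i, d in enumerate(reversed(digits)):
            acc[len(acc) - 1 - i] += d
        pending += 1
        if pending >= SAFE_ADDS:   # only now run the expensive pass
            propagate(acc)
            pending = 0
    propagate(acc)
    return acc
```
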

Best regards, Andrey Borodin, Octonica & Ural Federal University.




[HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)

2016-07-29 Thread Aleksander Alekseev
Hello

Some time ago we discussed an idea of "fast temporary tables":

https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu

In two words the idea is following.



PostgreSQL stores information about all relations in pg_catalog. Some
applications create and delete a lot of temporary tables, which causes
bloating of pg_catalog and frequent autovacuum runs on it. That is quite an
expensive operation and affects the performance of the entire database.

We could introduce a new type of temporary table. Information about
these tables would be stored not in the catalog but in the backend's memory.
This way a user can avoid the pg_catalog bloating problem and improve
overall database performance.



It took me a few months, but eventually I made it work. The attached patch
has some flaws. I decided not to invest a lot of time in documenting
it or pgindent'ing all files yet. In my experience it will be rewritten
entirely 3 or 4 times before merging anyway :) But it _works_ and
passes all tests I could think of, including non-trivial cases like
index-only or bitmap scans of catalog tables.

Usage example:

```
CREATE FAST TEMP TABLE fasttab_test1(x int, s text);

INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');

UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2;

DELETE FROM fasttab_test1 WHERE x = 3;

SELECT * FROM fasttab_test1 ORDER BY x;

DROP TABLE fasttab_test1;
```

More sophisticated examples can be found in the regression tests:

./src/test/regress/sql/fast_temp.sql

Any feedback on this patch will be much appreciated!

-- 
Best regards,
Aleksander Alekseev
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 0689cc9..940210b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1693,7 +1693,7 @@
   
   
p = permanent table, u = unlogged table,
-   t = temporary table
+   t = temporary table, f = fast temporary table
   
  
 
diff --git a/src/backend/access/common/Makefile b/src/backend/access/common/Makefile
index 1fa6de0..56de4dc 100644
--- a/src/backend/access/common/Makefile
+++ b/src/backend/access/common/Makefile
@@ -12,7 +12,7 @@ subdir = src/backend/access/common
 top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = heaptuple.o indextuple.o printtup.o reloptions.o scankey.o \
+OBJS = fasttab.o heaptuple.o indextuple.o printtup.o reloptions.o scankey.o \
 	tupconvert.o tupdesc.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/access/common/fasttab.c b/src/backend/access/common/fasttab.c
new file mode 100644
index 000..0a20247
--- /dev/null
+++ b/src/backend/access/common/fasttab.c
@@ -0,0 +1,1678 @@
+/* TODO TODO comment the general idea - in-memory tuples and indexes, hooks principle, FasttabSnapshots, etc */
+
+#include "c.h"
+#include "postgres.h"
+#include "pgstat.h"
+#include "miscadmin.h"
+#include "access/amapi.h"
+#include "access/fasttab.h"
+#include "access/relscan.h"
+#include "access/valid.h"
+#include "access/sysattr.h"
+#include "access/htup_details.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_type.h"
+#include "catalog/pg_depend.h"
+#include "catalog/pg_inherits.h"
+#include "catalog/pg_statistic.h"
+#include "storage/bufmgr.h"
+#include "utils/rel.h"
+#include "utils/inval.h"
+#include "utils/memutils.h"
+
+/*
+		  TYPEDEFS, MACRO DECLARATIONS AND CONST STATIC VARIABLES
+ */
+
+/* #define FASTTAB_DEBUG 1 */
+
+#ifdef FASTTAB_DEBUG
+static int32 fasttab_scan_tuples_counter = -1;
+#endif
+
+/* list of in-memory catalog tuples */
+typedef struct
+{
+	dlist_node	node;
+	HeapTuple	tup;
+}	DListHeapTupleData;
+
+typedef DListHeapTupleData *DListHeapTuple;
+
+#define FasttabSnapshotIsAnonymous(sn) ( !PointerIsValid((sn)->name) )
+#define FasttabSnapshotIsRoot(sn) ( !PointerIsValid((sn)->prev) )
+#define FasttabTransactionInProgress() \
+	( PointerIsValid(FasttabSnapshotGetCurrent()->prev))
+/* like strcmp but for integer types --- int, uint32, Oid, etc */
+#define FasttabCompareInts(x, y) ( (x) == (y) ? 0 : ( (x) > (y) ? 1 : -1 ))
+
+struct FasttabSnapshotData;		/* forward declaration required for
+ * relation_is_inmem_tuple_function typedef */
+typedef struct FasttabSnapshotData *FasttabSnapshot;
+
+typedef bool (*relation_is_inmem_tuple_function)
+			(Relation relation, HeapTuple tup, FasttabSnapshot fasttab_snapshot,
+		 int tableIdx);
+
+#define FasttabRelationMaxOidAttributes 2
+
+typedef const struct
+{
+	Oid			relationId;
+	relation_is_inmem_tuple_function is_inmem_tuple_fn;
+	AttrNumber	noidattr;
+	AttrNumber	attrNumbers[FasttabRelationMaxOidAttributes];
+}	FasttabRelationMethodsData;
+
+typedef FasttabRelationMethodsData const *FasttabRelationMethods;
+
+static bool generic_is_inmem_tuple(Relation relation, HeapTuple tup,
+	   FasttabSnapshot fasttab_snapshot, int tableIdx);

Re: [HACKERS] System load consideration before spawning parallel workers

2016-07-29 Thread Amit Kapila
On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi
 wrote:
> We observed that spawning the specified number of parallel workers for
> every query that qualifies for parallelism sometimes leads to a
> performance drop rather than an improvement during peak system load
> with other processes. Adding more processes to the system leads to
> more context switches, thus reducing the performance of other SQL
> operations.
>

Have you considered tuning max_worker_processes? Even if you have kept
a moderate value for max_parallel_workers_per_gather, the number of
processes might increase if the total number allowed is much bigger.

Is the total number of parallel workers greater than the number of
CPUs/cores in the system? If so, I think that might be one reason for
the performance degradation you are seeing.

> In order to avoid this problem, how about adding some kind of system
> load consideration into account before spawning the parallel workers?
>

A hook could be a possibility, but I am not sure how users would
decide the number of parallel workers; there might be other backends
as well that consume resources.  I think we might need some form of
throttling w.r.t. the assignment of parallel workers to avoid system
overload.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL logging problem in 9.4.3?

2016-07-29 Thread Michael Paquier
On Thu, Jul 28, 2016 at 4:59 PM, Michael Paquier
 wrote:
> On Wed, Apr 6, 2016 at 3:11 PM, Michael Paquier
>  wrote:
>> On Wed, Mar 23, 2016 at 12:45 PM, Michael Paquier
>>  wrote:
>>> On Wed, Mar 23, 2016 at 11:11 AM, David Steele  wrote:
>>>> I would prefer not to bump it to the next CF unless we decide this will
>>>> not get fixed for 9.6.
>>>
>>> It may make sense to add that to the list of open items for 9.6
>>> instead. That's not a feature.
>>
>> So I have moved this patch to the next CF for now, and will work on
>> fixing it rather soonishly as an effort to stabilize 9.6 as well as
>> back-branches.
>
> Well, not that soon in the end, but I am back on this... I have not
> completely reviewed all the code yet, and the case of an index relation
> referring to a relation optimized with truncate is still broken, but
> for now here is a rebased patch if people are interested. I am also
> going to pull a TAP test out of my pocket to ease testing.

The patch I sent yesterday was based on an incorrect version. Attached
is a slightly-modified version of the last one I found here
(https://www.postgresql.org/message-id/56b342f5.1050...@iki.fi), which
is rebased on HEAD at ed0b228. I have also converted the test case
script from upthread into a TAP test in src/test/recovery that covers
three cases, and I included it in the patch:
1) CREATE + INSERT + COPY => crash
2) CREATE + trigger + COPY => crash
3) CREATE + TRUNCATE + COPY => incorrect number of rows
The first two tests make the system crash; the third one reports an
incorrect number of rows.

This is registered in the next CF, by the way:
https://commitfest.postgresql.org/10/528/
Thoughts?
-- 
Michael


fix-wal-level-minimal-michael-2.patch
Description: invalid/octet-stream
