Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()
2016-07-30 3:47 GMT+02:00 David G. Johnston : > On Fri, Jul 29, 2016 at 8:18 PM, Bruce Momjian wrote: > >> On Tue, Jul 12, 2016 at 01:36:38PM +, thomas.ber...@1und1.de wrote: >> > The following bug has been logged on the website: >> > >> > Bug reference: 14244 >> > Logged by: Thomas Berger >> > Email address: thomas.ber...@1und1.de >> > PostgreSQL version: 9.5.3 >> > Operating system: any >> > Description: >> > >> > pg_size_pretty uses the suffix "kB" (kilobyte, 10^3 byte), but the >> returned >> > value is "KB", or "KiB" ( kibibyte, 2^10 byte). This is missleading and >> > should be fixed. See also https://en.wikipedia.org/wiki/Kibibyte >> > >> > =# select pg_size_pretty(1024000::bigint); >> > pg_size_pretty >> > >> > 1000 kB >> >> (Thread moved to hackers.) >> >> Yes, we have redefined kB, and documented its use in postgresql.conf and >> pg_size_pretty(), but it does not match any recognized standard. >> > > After bouncing on this for a bit I'm inclined to mark the bug itself > "won't fix" but introduce a "to_binary_iso" function (I'm hopeful a better > name will emerge...) that will output a number using ISO binary suffixes. > I would document this under 9.8 "data type formatting functions" instead of > within system functions. > > pg_size_pretty output can continue with a defined role to be used as input > into a GUC variable; and to keep backward compatibility. Add a note near > its definition to use "to_binary_iso" for a standard-conforming output > string. > We talked about this issue, when I wrote function pg_size_bytes. It is hard to fix these functions after years of usage. The new set of functions can be better pg_iso_size_pretty(); pg_iso_size_bytes(); or shorter name pg_isize_pretty(); pg_isize_bytes(); Regards Pavel > > David J. >
Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
2016-07-30 1:46 GMT+02:00 Tomas Vondra : > Hi, > > On 07/29/2016 01:15 PM, Aleksander Alekseev wrote: > >> Hello >> >> Some time ago we discussed an idea of "fast temporary tables": >> >> https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu >> >> In two words the idea is following. >> >> >> >> PostgreSQL stores information about all relations in pg_catalog. Some >> applications create and delete a lot of temporary tables. It causes a >> bloating of pg_catalog and running auto vacuum on it. It's quite an >> expensive operation which affects entire database performance. >> >> We could introduce a new type of temporary tables. Information about >> these tables is stored not in a catalog but in backend's memory. This >> way user can solve a pg_catalog bloating problem and improve overall >> database performance. >> >> >> > > Great! Thanks for the patch, this is definitely an annoying issue worth > fixing. I've spent a bit of time looking at the patch today, comments below > ... > Yes, it some what we need long time > > >> I took me a few months but eventually I made it work. Attached patch >> has some flaws. I decided not to invest a lot of time in documenting >> it or pgindent'ing all files yet. In my experience it will be rewritten >> entirely 3 or 4 times before merging anyway :) But it _works_ and >> passes all tests I could think of, including non-trivial cases like >> index-only or bitmap scans of catalog tables. >> >> > Well, jokes aside, that's a pretty lousy excuse for not writing any docs, > and you're pretty much asking the reviewers to reverse-engineer your > reasoning. So I doubt you'll get many serious reviews without fixing this > gap. > > Usage example: >> >> ``` >> CREATE FAST TEMP TABLE fasttab_test1(x int, s text); >> >> INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc'); >> >> UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2; >> >> DELETE FROM fasttab_test1 WHERE x = 3; >> >> SELECT * FROM fasttab_test1 ORDER BY x; >> >> DROP TABLE fasttab_test1; >> ``` >> >> More sophisticated examples could be find in regression tests: >> >> ./src/test/regress/sql/fast_temp.sql >> >> Any feedback on this patch will be much appreciated! >> >> >> > 1) I wonder whether the FAST makes sense - does this really change the > performance significantly? IMHO you only move the catalog rows to memory, > so why should the tables be any faster? I also believe this conflicts with > SQL standard specification of CREATE TABLE. > Probably has zero value to have slow and fast temp tables (from catalogue cost perspective). So the FAST implementation should be used everywhere. But there are some patterns used with work with temp tables,that should not working, and we would to decide if we prepare workaround or not. -- problematic pattern (old code) IF NOT EXISTS(SELECT * FROM pg_class WHERE ) THEN CREATE TEMP TABLE xxx() ELSE TRUNCATE TABLE xxx; END IF; -- modern patter (new code) BEGIN TRUNCATE TABLE xxx; EXCEPTION WHEN . THEN CREATE TEMP TABLE(...) END; In this case we can use GUC, because visible behave should be same. The benefit of zero catalogue cost temp tables is significant - and for some larger applications the temp tables did hard performance issues. > > 2) Why do we need the new relpersistence value? ISTM we could easily got > with just RELPERSISTENCE_TEMP, which would got right away of many chances > as the steps are exactly the same. > > IMHO if this patch gets in, we should use it as the only temp table > implementation (Or can you think of cases where keeping rows in pg_class > has advantages?). That'd also eliminate the need for FAST keyword in the > CREATE TABLE command. > > The one thin I'm not sure about is that our handling of temporary tables > is not standard compliant - we require each session to create it's own > private temporary table. Moving the rows from pg_class into backend memory > seems to go in the opposite direction, but as no one was planning to fix > this, I don't think it matters much. > > 3) I think the heapam/indexam/xact and various other places needs a major > rework. You've mostly randomly sprinkled the code with function calls to > make the patch work - that's fine for an initial version, but a more > principled approach is needed. > > 4) I'm getting failures in the regression suite - apparently the patch > somehow affects costing of index only scans, so that a several queries > switch from index only scans to bitmap index scans etc. I haven't > investigated this more closely, but it seems quite consistent (and I don't > see it without the patch). It seems the patch delays building of visibility > map, or something like that. > Some other random notes: 1. With this code should not be hard to implement global temp tables - shared persistent structure, temp local data - significant help for any who have to migrate from Oracle. 2. This should to work on slaves - it is one of ToDo 3. I didn't see support for memory store
Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()
On Fri, Jul 29, 2016 at 8:18 PM, Bruce Momjian wrote: > On Tue, Jul 12, 2016 at 01:36:38PM +, thomas.ber...@1und1.de wrote: > > The following bug has been logged on the website: > > > > Bug reference: 14244 > > Logged by: Thomas Berger > > Email address: thomas.ber...@1und1.de > > PostgreSQL version: 9.5.3 > > Operating system: any > > Description: > > > > pg_size_pretty uses the suffix "kB" (kilobyte, 10^3 byte), but the > returned > > value is "KB", or "KiB" ( kibibyte, 2^10 byte). This is missleading and > > should be fixed. See also https://en.wikipedia.org/wiki/Kibibyte > > > > =# select pg_size_pretty(1024000::bigint); > > pg_size_pretty > > > > 1000 kB > > (Thread moved to hackers.) > > Yes, we have redefined kB, and documented its use in postgresql.conf and > pg_size_pretty(), but it does not match any recognized standard. > After bouncing on this for a bit I'm inclined to mark the bug itself "won't fix" but introduce a "to_binary_iso" function (I'm hopeful a better name will emerge...) that will output a number using ISO binary suffixes. I would document this under 9.8 "data type formatting functions" instead of within system functions. pg_size_pretty output can continue with a defined role to be used as input into a GUC variable; and to keep backward compatibility. Add a note near its definition to use "to_binary_iso" for a standard-conforming output string. David J.
Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()
On Fri, Jul 29, 2016 at 08:18:38PM -0400, Bruce Momjian wrote: > However, that is not the end of the story. Things have moved forward > since 2006 and there is now firm support for either KB or KiB to be > 1024-based units. This blog post explains the current state of prefix > specification: > > > http://pchelp.ricmedia.com/kilobytes-megabytes-gigabytes-terabytes-explained/ > > and here is a summary for 1000/1024-based units: > > Kilobyte (Binary, JEDEC)KB 1024 > Kilobyte (Decimal, Metric) kB 1000 > Kibibyte (Binary, IEC) KiB 1024 Oh, also, here is a Wikipedia article that has a nice chart on the top right: https://en.wikipedia.org/wiki/Binary_prefix and a post that explains some of the background: http://superuser.com/questions/938234/size-of-files-in-windows-os-its-kb-or-kb -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #14244: wrong suffix for pg_size_pretty()
On Tue, Jul 12, 2016 at 01:36:38PM +, thomas.ber...@1und1.de wrote: > The following bug has been logged on the website: > > Bug reference: 14244 > Logged by: Thomas Berger > Email address: thomas.ber...@1und1.de > PostgreSQL version: 9.5.3 > Operating system: any > Description: > > pg_size_pretty uses the suffix "kB" (kilobyte, 10^3 byte), but the returned > value is "KB", or "KiB" ( kibibyte, 2^10 byte). This is missleading and > should be fixed. See also https://en.wikipedia.org/wiki/Kibibyte > > =# select pg_size_pretty(1024000::bigint); > pg_size_pretty > > 1000 kB (Thread moved to hackers.) The Postgres docs specify that kB is based on 1024 or 2^10: https://www.postgresql.org/docs/9.6/static/functions-admin.html Note: The units kB, MB, GB and TB used by the functions pg_size_pretty and pg_size_bytes are defined using powers of 2 rather than powers of 10, so 1kB is 1024 bytes, 1MB is 10242 = 1048576 bytes, and so on. These prefixes were introduced to GUC variable specification in 2006: commit b517e653489f733893d61e7a84c118325394471c Author: Peter Eisentraut Date: Thu Jul 27 08:30:41 2006 + Allow units to be specified with configuration settings. and added to postgresql.conf: # Memory units: kB = kilobytesTime units: ms = milliseconds #MB = megabytes s = seconds #GB = gigabytes min = minutes #TB = terabytes h = hours # d = days and the units were copied when pg_size_pretty() was implemented. These units are based on the International System of Units (SI)/metric. However, the SI system is power-of-10-based, and we just re-purposed them to be 1024 or 2^10-based. However, that is not the end of the story. Things have moved forward since 2006 and there is now firm support for either KB or KiB to be 1024-based units. This blog post explains the current state of prefix specification: http://pchelp.ricmedia.com/kilobytes-megabytes-gigabytes-terabytes-explained/ and here is a summary for 1000/1024-based units: Kilobyte (Binary, JEDEC)KB 1024 Kilobyte (Decimal, Metric) kB 1000 Kibibyte (Binary, IEC) KiB 1024 You will notice that none of these list kB as 1024, which explains this bug report. Yes, we have redefined kB, and documented its use in postgresql.conf and pg_size_pretty(), but it does not match any recognized standard. I am thinking Postgres 10 would be a good time to switch to KB as a 1024-based prefix. Unfortunately, there is no similar fix for MB, GB, etc. 'm' is 'milli' so there we never used mB, so in JEDEC and Metric, MB is ambiguous as 1000-based or 1024-based. IEC does give us a unique specification for 'mega', MiB, and GiB, which might be what we want to use, but that might be too big a change, and I rarely see those. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
Hi, On 07/29/2016 01:15 PM, Aleksander Alekseev wrote: Hello Some time ago we discussed an idea of "fast temporary tables": https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu In two words the idea is following. PostgreSQL stores information about all relations in pg_catalog. Some applications create and delete a lot of temporary tables. It causes a bloating of pg_catalog and running auto vacuum on it. It's quite an expensive operation which affects entire database performance. We could introduce a new type of temporary tables. Information about these tables is stored not in a catalog but in backend's memory. This way user can solve a pg_catalog bloating problem and improve overall database performance. Great! Thanks for the patch, this is definitely an annoying issue worth fixing. I've spent a bit of time looking at the patch today, comments below ... I took me a few months but eventually I made it work. Attached patch has some flaws. I decided not to invest a lot of time in documenting it or pgindent'ing all files yet. In my experience it will be rewritten entirely 3 or 4 times before merging anyway :) But it _works_ and passes all tests I could think of, including non-trivial cases like index-only or bitmap scans of catalog tables. Well, jokes aside, that's a pretty lousy excuse for not writing any docs, and you're pretty much asking the reviewers to reverse-engineer your reasoning. So I doubt you'll get many serious reviews without fixing this gap. Usage example: ``` CREATE FAST TEMP TABLE fasttab_test1(x int, s text); INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc'); UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2; DELETE FROM fasttab_test1 WHERE x = 3; SELECT * FROM fasttab_test1 ORDER BY x; DROP TABLE fasttab_test1; ``` More sophisticated examples could be find in regression tests: ./src/test/regress/sql/fast_temp.sql Any feedback on this patch will be much appreciated! 1) I wonder whether the FAST makes sense - does this really change the performance significantly? IMHO you only move the catalog rows to memory, so why should the tables be any faster? I also believe this conflicts with SQL standard specification of CREATE TABLE. 2) Why do we need the new relpersistence value? ISTM we could easily got with just RELPERSISTENCE_TEMP, which would got right away of many chances as the steps are exactly the same. IMHO if this patch gets in, we should use it as the only temp table implementation (Or can you think of cases where keeping rows in pg_class has advantages?). That'd also eliminate the need for FAST keyword in the CREATE TABLE command. The one thin I'm not sure about is that our handling of temporary tables is not standard compliant - we require each session to create it's own private temporary table. Moving the rows from pg_class into backend memory seems to go in the opposite direction, but as no one was planning to fix this, I don't think it matters much. 3) I think the heapam/indexam/xact and various other places needs a major rework. You've mostly randomly sprinkled the code with function calls to make the patch work - that's fine for an initial version, but a more principled approach is needed. 4) I'm getting failures in the regression suite - apparently the patch somehow affects costing of index only scans, so that a several queries switch from index only scans to bitmap index scans etc. I haven't investigated this more closely, but it seems quite consistent (and I don't see it without the patch). It seems the patch delays building of visibility map, or something like that. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Sat, Jul 30, 2016 at 1:17 AM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier >> wrote: >>> While looking at the series of functions pg_get_*, I have noticed as >>> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does >>> not know a user. Perhaps we'd want to change that to NULL for >>> consistency with the rest? > >> That's probably correct in theory, but it's a little less closely >> related, and I'm not entirely sure how far we want to go with this. >> Remember, the original purpose was to avoid having an internal error >> (cache lookup failed, XX000) exposed as a user-visible error message. >> Are we at risk from veering from actual bug-fixing off into useless >> tinkering? Not sure. > > I'd vote for leaving that one alone; yeah, it's a bit inconsistent > now, but no one has complained about its behavior. OK for me. Thanks for the commit. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sslmode=require fallback
On Fri, Jul 29, 2016 at 4:13 PM, Bruce Momjian wrote: > Yes, I am thinking of a case where Postgres is down but a malevolent > user starts a Postgres server on 5432 to gather passwords. Or someone spoofs your DNS lookup, which is an attack that can actually be done remotely in some cases. For what it's worth the SCRAM work also addresses precisely this danger though it doesn't prevent the attacker from pretending to be a real server and capturing private data from the SQL updates. Even in the case where there's no known server certificate it could save the fingerprint seen once and require it not change. This proves to be a headache to manage though. It's equivalent to the SSH known_hosts scheme. How many times have you seen that warning message and just automatically removed the entry in known_hosts without verifying... One day DNSSEC will solve all these problems though. Then you'll just store the certificate in the DNS entry for the server and the client will insist it match. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3
Michael, (dropping -general, not sure why that list was included...) * Michael Paquier (michael.paqu...@gmail.com) wrote: > On Wed, Jul 27, 2016 at 8:07 AM, Stephen Frost wrote: > > That'd be great. It's definitely on my list of things to look into, but > > I'm extremely busy this week. I hope to look into it on Friday, would > > be great to see what you find. > > Sequences that are directly defined in extensions do not get dumped, > and sequences that are part of a serial column in an extension are > getting dumped. Looking into this problem, getOwnedSeqs() is visibly > doing an incorrect assumption: sequences owned by table columns are > dumped unconditionally, but this is not true for sequences that are > part of extensions. More precisely, dobj->dump is being enforced to > DUMP_COMPONENT_ALL, which makes the sequence definition to be dumped. > Oops. Right, it should be set to the same value as the table which is being dumped rather than DUMP_COMPONENT_ALL. Moreover, the owning_tab->dobj.dump check should explicitly check against DUMP_COMPONENT_NONE, but that's just to be neat since that's '0'. > The patch attached fixes the problem for me. I have added as well > tests in test_pg_dump in the shape of sequences defined in an > extension, and sequences that are part of a serial column. This patch > is also able to work in the case where a sequence is created as part > of a serial column, and gets removed after, say with ALTER EXTENSION > DROP SEQUENCE. The behavior for sequences and serial columns that are > not part of extensions is unchanged. This isn't correct as the dump components which are independent of the extension (ACLs, security labels, policies) won't be dumped. Consider, for example, what happens if the user runs: GRANT USAGE ON SEQUENCE TO role; This wouldn't get dumped out with your patch, since we would decide that, because the sequence is a member of the extension, that nothing about it should be dumped, which isn't correct. > Stephen, it would be good if you could check the correctness of this > patch as you did all this refactoring of pg_dump to support catalog > ACLs. I am sure by the way that checking for (owning_tab->dobj.dump && > DUMP_COMPONENT_DEFINITION) != 0 is not good because of for example the > case of a serial column created in an extension where the sequence is > dropped from the extension afterwards. If the sequence is dropped from the extension then the sequence will be ignored by checkExtensionMembership() during selectDumpableObject() and the regular selectDumpableObject() code will handle marking the sequence appropriately. What we need to be doing here is combining the set of components that the sequence has been marked with and the set of components that the table has been marked with, not trying to figure out if the sequence is a member of an extension or not and changing what to do in those cases, that's checkExtensionMembership()'s job, really. Attached is a patch implementing this and which passes the regression tests you added and a couple that I added for the non-extension case. I'm going to look at adding a few more regression tests and if I don't come across any issues then I'll likely push the fix sometime tomorrow. Comments welcome, of course. Thanks! Stephen diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c new file mode 100644 index 08c2b0c..333e628 *** a/src/bin/pg_dump/pg_dump.c --- b/src/bin/pg_dump/pg_dump.c *** getOwnedSeqs(Archive *fout, TableInfo tb *** 6035,6048 if (!OidIsValid(seqinfo->owning_tab)) continue; /* not an owned sequence */ ! if (seqinfo->dobj.dump & DUMP_COMPONENT_DEFINITION) ! continue; /* no need to search */ owning_tab = findTableByOid(seqinfo->owning_tab); ! if (owning_tab && owning_tab->dobj.dump) ! { seqinfo->interesting = true; - seqinfo->dobj.dump = DUMP_COMPONENT_ALL; - } } } --- 6035,6061 if (!OidIsValid(seqinfo->owning_tab)) continue; /* not an owned sequence */ ! owning_tab = findTableByOid(seqinfo->owning_tab); ! ! /* ! * We need to dump the components that are being dumped for the table ! * and any components which the sequence is explicitly marked with. ! * ! * We can't simply use the set of components which are being dumped for ! * the table as the table might be in an extension (and only the ! * non-extension components, eg: ACLs if changed, security labels, and ! * policies, are being dumped) while the sequence is not (and therefore ! * the definition and other components should also be dumped). ! * ! * If the sequence is part of the extension then it should be properly ! * marked by checkExtensionMembership() and this will be a no-op as the ! * table will be equivilantly marked. ! */ ! seqinfo->dobj.dump = seqinfo->dobj.dump | owning_tab->dobj.dump; ! ! if (seqinfo->dobj.dump != DUMP_COMPONENT_NONE) seqinfo->interesting = true; } }
Re: [HACKERS] to_date_valid()
On 7/29/16 1:33 PM, Andreas 'ads' Scherbaum wrote: On 27.07.2016 05:00, Joshua D. Drake wrote: On 07/26/2016 06:25 PM, Peter Eisentraut wrote: On 7/5/16 4:24 AM, Albe Laurenz wrote: But notwithstanding your feeling that you would like your application to break if it makes use of this behaviour, it is a change that might make some people pretty unhappy - nobody can tell how many. What is the use of the existing behavior? You get back an arbitrary implementation dependent value. We don't even guarantee what the value will be. If we changed it to return a different implementation dependent value, would users get upset? No they would not get upset because they wouldn't know. Can we just do the right thing? I'm in favour of fixing this, and update the documentation. +1. I'd say that if users complain we can always create an extension (on PGXN) that offers the old behavior. Users could even put that function before pg_catalog in search_path and get the old behavior back. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Design for In-Core Logical Replication
Robert Haas wrote: > One minor comment is that this document makes extensive use of Terms > With Initial Capitals, which seems stylistically odd, although I'm not > precisely sure what would be better. We use publication on the first use only, which is turned into italics. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated
On 7/29/16 3:13 PM, David Fetter wrote: > I expect this kind of blather from MySQL, but you've brought up > something that's been bothering me for awhile. PostgreSQL's response > should look more like this: > > ERROR: month field value out of range: "2016-99-99" > LINE 1: select cast('2016-99-99' as date); > ^ > Any idea how much effort that would be? This particular case is probably not hard, but the problem is that that would raise the bar about error pointer precision, and you then should also update a bunch of other places to give similar precision. That could be a lot of work. I am, however, of the opinion, that these kinds of things can never be helpful enough. The latest trend is start and end pointers, which would be nice. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Performance of tqueue.c's tuple remapping logic
$SUBJECT sucks. Create a table containing lots of composite arrays: regression=# create table foo (f1 int8_tbl[]); CREATE TABLE regression=# insert into foo select array[row(1,2),row(3,4)]::int8_tbl[] from generate_series (1,1000); INSERT 0 1000 regression=# vacuum analyze foo; VACUUM Establish a baseline for how long it takes to scan this table: regression=# explain analyze select f1 from foo; QUERY PLAN - Seq Scan on foo (cost=0.00..263935.06 rows=1006 width=101) (actual time=0.027..1461.236 rows=1000 loops=1) Planning time: 0.149 ms Execution time: 1996.995 ms (3 rows) ... or select a non-composite value out of it: regression=# explain analyze select f1[1].q1 from foo; QUERY PLAN --- Seq Scan on foo (cost=0.00..263935.06 rows=1006 width=8) (actual time=1.122..3736.121 rows=1000 loops=1) Planning time: 0.077 ms Execution time: 4285.872 ms (3 rows) Now let's try those same queries in parallel mode: regression=# set force_parallel_mode to 1; SET regression=# explain analyze select f1[1].q1 from foo; QUERY PLAN - Gather (cost=1000.00..1264935.66 rows=1006 width=8) (actual time=11.402..12753.782 rows=1000 loops=1) Workers Planned: 1 Workers Launched: 1 Single Copy: true -> Seq Scan on foo (cost=0.00..263935.06 rows=1006 width=8) (actual time=0.182..4523.724 rows=1000 loops=1) Planning time: 0.081 ms Execution time: 13736.321 ms (7 rows) regression=# explain analyze select f1 from foo; QUERY PLAN --- Gather (cost=1000.00..1264935.66 rows=1006 width=101) (actual time=6.659..22693.798 rows=1000 loops=1) Workers Planned: 1 Workers Launched: 1 Single Copy: true -> Seq Scan on foo (cost=0.00..263935.06 rows=1006 width=101) (actual time=0.780..2579.707 rows=1000 loops=1) Planning time: 0.073 ms Execution time: 25925.709 ms (7 rows) So, having to do record-type remapping in the fairly trivial case of two-element composite arrays nearly doubles the already rather excessive runtime for a parallel query returning lots of data. Just to add insult to injury, the backend's memory consumption bloats to something over 5.5G during that last query. Which is not terribly surprising given all the cavalier use of TopMemoryContext in tqueue.c. Since the entire point of parallel query is to deal with large data volumes, I think this means that tqueue.c is simply unfit for production use as it stands. I propose to undertake a thorough code review, and possibly significant rewrite, of tqueue.c over the next few days. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated
On Fri, Jul 29, 2016 at 09:20:12PM +0300, Nikolay Samokhvalov wrote: > On Fri, Jul 29, 2016 at 5:28 PM, Jim Nasby wrote: > > > > The way I sum up MySQL vs PG for people that ask is to recount how they > > "fixed" the Feb. 31st bug when they released strict mode (something that > > they actually called out in the release PR). With strict mode enabled, Feb. > > 30th and 31st would give you an error. Feb 35th was still silently > > converted to March whatever. *That was the MySQL mentality: data quality > > doesn't matter compared to "ease of use".* > > > > They've done this throughout their history... when presented with a hard > > problem, they skip around it or plaster over it, and then they promote that > > their solution is the only right way to solve the problem. (Their docs > > actually used to say that anything other that table-level locking was a bad > > idea.) > > > This is exactly what I mean saying MySQL speaks different language than I > know, and that's why I simply cannot use it: > > (mysql 5.7.12) > > mysql> select cast('2016-99-99' as date); > ++ > | cast('2016-99-99' as date) | > ++ > | NULL | > ++ > 1 row in set, 1 warning (0.00 sec) > > > In Postgres: > > test=# select cast('2016-99-99' as date); > ERROR: date/time field value out of range: "2016-99-99" > LINE 1: select cast('2016-99-99' as date); > ^ I expect this kind of blather from MySQL, but you've brought up something that's been bothering me for awhile. PostgreSQL's response should look more like this: ERROR: month field value out of range: "2016-99-99" LINE 1: select cast('2016-99-99' as date); ^ Any idea how much effort that would be? Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we lost Uber as a user
* Bruce Momjian (br...@momjian.us) wrote: > On Fri, Jul 29, 2016 at 10:44:29AM -0400, Stephen Frost wrote: > > Another thought that was kicking around in my head related to that is if > > we could have indexes that only provide page-level information (similar > > to BRIN, but maybe a btree) and which also would allow HOT updates. > > Those indexes would typically be used in a bitmap index scan where we're > > going to be doing a bitmap heap scan with a recheck, of course, though I > > wonder if we could come up with a way to do an in-order bitmap index > > scan where we sort the tuples on the page and then perform some kind of > > mergejoin recheck (or just pull out whatever the lowest-not-seen each > > time we sort the tuples on the page). > > So allow HOT updates if the updated row is on the same page, even if the > indexed column was changed, by scanning the page --- got it. The idea I had was to allow creation of indexes which *only* include the page ID. Your rephrase seems to imply that we'd have a regular index but then be able to figure out if a given tuple had any HOT updates performed on it and, if so, scan the entire page. As I understand it, it's more complicated than that because we must involve an index when updating a tuple in some cases (UNIQUE?) and therefore we don't perform HOT in the case where any indexed column is being changed. Of course, this only works if these page-level indexes don't support the features that prevent HOT updates today. If we can tell which existing indexes have been built in a such a way to prevent HOT updates and which would work with a HOT updated tuple, then perhaps we could change the HOT code to check that when it's considering if a tuple can be updated using HOT or not and not have only specific indexes able to support HOT updated tuples. This is clearly all hand-wavy, but if the the BRIN indexes could work in this way then it seems like we should be able to generalize what it is about BRIN that allows it and provide a way for other kinds of indexes to support tuples being HOT updated. It appears to be clearly useful in some use-cases. That's really what I was trying to get at. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] to_date_valid()
On 27.07.2016 05:00, Joshua D. Drake wrote: On 07/26/2016 06:25 PM, Peter Eisentraut wrote: On 7/5/16 4:24 AM, Albe Laurenz wrote: But notwithstanding your feeling that you would like your application to break if it makes use of this behaviour, it is a change that might make some people pretty unhappy - nobody can tell how many. What is the use of the existing behavior? You get back an arbitrary implementation dependent value. We don't even guarantee what the value will be. If we changed it to return a different implementation dependent value, would users get upset? No they would not get upset because they wouldn't know. Can we just do the right thing? I'm in favour of fixing this, and update the documentation. But given the discussions in the past, it seemed like people actually depend on this behaviour. Hence the additional function. if this is fixed, it's too late for the current beta. But it's a good time to add a note in the release notes, and advise people that it will be changed in the next release. A workaround can be to rename the current function to something like "to_date_legacy", or "to_date_oracle". And implement the checks in to_date. -- Andreas 'ads' Scherbaum German PostgreSQL User Group European PostgreSQL User Group - Board of Directors Volunteer Regional Contact, Germany - PostgreSQL Project -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated
On Fri, Jul 29, 2016 at 5:28 PM, Jim Nasby wrote: > > The way I sum up MySQL vs PG for people that ask is to recount how they > "fixed" the Feb. 31st bug when they released strict mode (something that > they actually called out in the release PR). With strict mode enabled, Feb. > 30th and 31st would give you an error. Feb 35th was still silently > converted to March whatever. *That was the MySQL mentality: data quality > doesn't matter compared to "ease of use".* > > They've done this throughout their history... when presented with a hard > problem, they skip around it or plaster over it, and then they promote that > their solution is the only right way to solve the problem. (Their docs > actually used to say that anything other that table-level locking was a bad > idea.) This is exactly what I mean saying MySQL speaks different language than I know, and that's why I simply cannot use it: (mysql 5.7.12) mysql> select cast('2016-99-99' as date); ++ | cast('2016-99-99' as date) | ++ | NULL | ++ 1 row in set, 1 warning (0.00 sec) In Postgres: test=# select cast('2016-99-99' as date); ERROR: date/time field value out of range: "2016-99-99" LINE 1: select cast('2016-99-99' as date); ^
Re: [HACKERS] Wrong defeinition of pq_putmessage_noblock since 9.5
I wrote: > Kyotaro HORIGUCHI writes: >> Any work in this area is likely 10.0 material at this point. > I'm not really happy with that, since refactoring it again will create > back-patch hazards. But I see that a lot of the mess here was created > in 9.5, which means we're probably stuck with back-patch hazards anyway. I've pushed Kyotaro-san's original patch, which is clearly a bug fix. I think the additional changes discussed later in this thread are cosmetic, and probably should wait for a more general review of the layering decisions in pqcomm.c. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
Robert Haas writes: > On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier > wrote: >> While looking at the series of functions pg_get_*, I have noticed as >> well that pg_get_userbyid() returns "unknown (OID=%u)" when it does >> not know a user. Perhaps we'd want to change that to NULL for >> consistency with the rest? > That's probably correct in theory, but it's a little less closely > related, and I'm not entirely sure how far we want to go with this. > Remember, the original purpose was to avoid having an internal error > (cache lookup failed, XX000) exposed as a user-visible error message. > Are we at risk from veering from actual bug-fixing off into useless > tinkering? Not sure. I'd vote for leaving that one alone; yeah, it's a bit inconsistent now, but no one has complained about its behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3
On Wed, Jul 27, 2016 at 2:24 AM, Michael Paquier wrote: > On Wed, Jul 27, 2016 at 8:07 AM, Stephen Frost wrote: >> That'd be great. It's definitely on my list of things to look into, but >> I'm extremely busy this week. I hope to look into it on Friday, would >> be great to see what you find. > > Sequences that are directly defined in extensions do not get dumped, > and sequences that are part of a serial column in an extension are > getting dumped. Looking into this problem, getOwnedSeqs() is visibly > doing an incorrect assumption: sequences owned by table columns are > dumped unconditionally, but this is not true for sequences that are > part of extensions. More precisely, dobj->dump is being enforced to > DUMP_COMPONENT_ALL, which makes the sequence definition to be dumped. > Oops. > > The patch attached fixes the problem for me. I have added as well > tests in test_pg_dump in the shape of sequences defined in an > extension, and sequences that are part of a serial column. This patch > is also able to work in the case where a sequence is created as part > of a serial column, and gets removed after, say with ALTER EXTENSION > DROP SEQUENCE. The behavior for sequences and serial columns that are > not part of extensions is unchanged. > > Stephen, it would be good if you could check the correctness of this > patch as you did all this refactoring of pg_dump to support catalog > ACLs. I am sure by the way that checking for (owning_tab->dobj.dump && > DUMP_COMPONENT_DEFINITION) != 0 is not good because of for example the > case of a serial column created in an extension where the sequence is > dropped from the extension afterwards. Stephen, is this still on your list of things for today? Please provide a status update. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Tue, Jul 26, 2016 at 9:27 PM, Michael Paquier wrote: > On Wed, Jul 27, 2016 at 5:11 AM, Robert Haas wrote: >> Committed with minor kibitizing: you don't need an "else" after a >> statement that transfers control out of the function. > > Thanks. Right, I forgot that. > >> Shouldn't >> pg_get_function_arguments, pg_get_function_identity_arguments, >> pg_get_function_result, and pg_get_function_arg_default get the same >> treatment? > > Changing all of them make sense. Please see attached. Committed. > While looking at the series of functions pg_get_*, I have noticed as > well that pg_get_userbyid() returns "unknown (OID=%u)" when it does > not know a user. Perhaps we'd want to change that to NULL for > consistency with the rest? That's probably correct in theory, but it's a little less closely related, and I'm not entirely sure how far we want to go with this. Remember, the original purpose was to avoid having an internal error (cache lookup failed, XX000) exposed as a user-visible error message. Are we at risk from veering from actual bug-fixing off into useless tinkering? Not sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Wrong defeinition of pq_putmessage_noblock since 9.5
Kyotaro HORIGUCHI writes: > At Fri, 29 Jul 2016 12:47:53 +0900, Michael Paquier > wrote in > >>> At Thu, 28 Jul 2016 10:46:00 -0400, Tom Lane wrote in >>> <4313.1469717...@sss.pgh.pa.us> >>> I dunno, this seems like it's doubling down on some extremely poor >>> decisions. Why is it that you now have to flip a coin to guess whether >>> the prefix is pq_ or socket_ for functions in this module? I would >>> rather see that renaming reverted. >> Yes, I agree with that. I cannot understand the intention behind >> 2bd9e41 to rename those routines as they are now, so getting them back >> with pg_ as prefix looks like a good idea to me. > The many of the socket_* functions are required to be replaced > with mq_* functions for backgroud workers. So reverting the names > of socket_* functions should be accompanied by the need to, for > example, changing their callers to use them explicitly via > PqCommMethods, not hiding with macros, or renaming current pq_* > macros to, say, pqi_. (it stands for PQ-Indirect as a tentative) > I'm not confident on the new prefix since I'm not sure that what > the 'pq' stands for. (Postgres Query?) Hmm. Of course the problem with this approach is that we end up touching a whole bunch of call sites, which sort of negates the point of having installed a compatibility macro layer. [ spends awhile longer looking around at this code ] Maybe the real problem here is that the abstraction layer is so incomplete and apparently haphazard, so that it's not very obvious where you should use a pq_xxx name and where you should refer to socket_xxx. For some of the functions in pqcomm.c, such as internal_flush, it's impossible to tell which side of the abstraction boundary they're supposed to be on. (I suspect that that particular function has simply been ignored on the undocumented assumption that a bgworker could never call it, but that's the kind of half-baked work that I don't find acceptable.) I think the core work that needs to be done here is to clearly identify each function in pqcomm.c as either above or below the PqCommMethods abstraction boundary. Maybe we should split one set or the other out to a new source file. In any case, the naming convention ought to reflect that hierarchy. I withdraw the idea that we should try to revert all these functions back to their old names, since that would obscure the layering distinction between socket-specific and general-usage functions. I now think that the problem is that that distinction hasn't been drawn sharply enough. > The set of functions in PQcommMethods doesn't seem clean. They > are chosen arbitrarily just so that other pq_* functions used in > parallel workers will work as expected. I suppose that it needs > some refactoring. Yeah, exactly. > Any work in this area is likely 10.0 material at this point. I'm not really happy with that, since refactoring it again will create back-patch hazards. But I see that a lot of the mess here was created in 9.5, which means we're probably stuck with back-patch hazards anyway. regards, tom lane PS: "PQ" comes from PostQUEL, I believe. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Design for In-Core Logical Replication
On Wed, Jul 20, 2016 at 4:08 AM, Simon Riggs wrote: > In this post, Petr and I present a joint view on a design for how this > should work in-core, based upon our implementation experiences with physical > replication, pglogical and various comments so far. > > Note that this has substantial user-visible differences from pglogical, > though much of the underlying architecture is reused. > > I should stress that not all of the aspects are implemented yet. The post > here today is a combination of all of our attempts to bring architecture, > usability and security into one place, including a coherent way of > describing the features and how they work. > > Your comments and questions are sought now as we begin the main development > effort to get this into PostgreSQL 10.0 Thanks for publishing this. One minor comment is that this document makes extensive use of Terms With Initial Capitals, which seems stylistically odd, although I'm not precisely sure what would be better. I would have expected that there would be a concept of a REPLICATION SET, defining which tables are to be replicated; here, that seems to be the Publication. That may fine, but I wonder if there is any value in separating those things. It's clear, for example, that a replication set can be dumped: which tables are members of which replication sets is durable metadata. It's less clear that a publication can be dumped; that might include things which are not durable metadata, such as associations with slots. It's generally not really clear to me based on reading this exactly what information is encapsulated in a Publication or a Subscription, which makes it hard to evaluate design decisions like this one: > > The definition of a Publication object will be included within > pg_dump by default when all of the objects in the Publication are > requested as part of the dump specification. > > > Subscriptions are not dumped by pg_dump by default, but can be > requested using --subscriptions parameter. > I think that to really understand exactly what you and Petr have in mind, we'd need a description of where publication and subscription data is stored within the server, and exactly what gets stored. Perhaps that will come in a later email. I'm not bashing the design, exactly, I just can't quite see how all of the pieces fit together yet. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sslmode=require fallback
On 7/29/16 11:13 AM, Bruce Momjian wrote: > Yes, I am thinking of a case where Postgres is down but a malevolent > user starts a Postgres server on 5432 to gather passwords. Verifying > against an SSL certificate would avoid this problem, so there is some > value in using SSL on localhost. (There is no such security available > for Unix-domain socket connections.) Sure, there is the requirepeer connection option for that. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we lost Uber as a user
On Fri, Jul 29, 2016 at 10:44:29AM -0400, Stephen Frost wrote: > Another thought that was kicking around in my head related to that is if > we could have indexes that only provide page-level information (similar > to BRIN, but maybe a btree) and which also would allow HOT updates. > Those indexes would typically be used in a bitmap index scan where we're > going to be doing a bitmap heap scan with a recheck, of course, though I > wonder if we could come up with a way to do an in-order bitmap index > scan where we sort the tuples on the page and then perform some kind of > mergejoin recheck (or just pull out whatever the lowest-not-seen each > time we sort the tuples on the page). So allow HOT updates if the updated row is on the same page, even if the indexed column was changed, by scanning the page --- got it. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sslmode=require fallback
On Tue, Jul 19, 2016 at 03:24:26PM -0400, Peter Eisentraut wrote: > On 7/19/16 10:00 AM, Magnus Hagander wrote: > > What could actually be useful there is to explicitly put hostnossl on > > the localhost entries. With the current defaults on the clients, that > > wouldn't break anything, and it would leave people without the > > performance issues that you run into in the default deployments. And for > > localhost it really does't make sense to encrypt -- for the local LAN > > segment that can be argued, but for localhost... > > But even on localhost you ideally want a way to confirm that the server > you are connecting to is the right one, so you might want certificates. > Plus the server might want certificates from the clients. (See also the > occasional discussion about supporting SSL over Unix-domain sockets.) Yes, I am thinking of a case where Postgres is down but a malevolent user starts a Postgres server on 5432 to gather passwords. Verifying against an SSL certificate would avoid this problem, so there is some value in using SSL on localhost. (There is no such security available for Unix-domain socket connections.) -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix comment in ATExecValidateConstraint
On Mon, Jul 25, 2016 at 4:18 AM, Amit Langote wrote: > The comment seems to have been copied from ATExecAddColumn, which says: > > /* > * If we are told not to recurse, there had better not be any > - * child tables; else the addition would put them out of step. > > For ATExecValidateConstraint, it should say something like: > > + * child tables; else validating the constraint would put them > + * out of step. > > Attached patch fixes it. I agree that the current comment is wrong, but what does "out of step" actually mean here, anyway? I think this isn't very clear. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated
On 7/29/16 8:17 AM, Kevin Grittner wrote: On Thu, Jul 28, 2016 at 8:39 PM, Tatsuo Ishii wrote: > BTW, is there any opposite information, i.e. showing the > limitation of MySQL comparing with PostgreSQL? I'm not aware of a general list on the topic, but in reviewing academic papers regarding transaction isolation I did find (and confirm) that MySQL InnoDB relaxes the "strict" aspect of the Strict 2 Phase Locking they use for implementing serializable transactions. "For performance reasons" they drop the locks ... The way I sum up MySQL vs PG for people that ask is to recount how they "fixed" the Feb. 31st bug when they released strict mode (something that they actually called out in the release PR). With strict mode enabled, Feb. 30th and 31st would give you an error. Feb 35th was still silently converted to March whatever. *That was the MySQL mentality: data quality doesn't matter compared to "ease of use".* They've done this throughout their history... when presented with a hard problem, they skip around it or plaster over it, and then they promote that their solution is the only right way to solve the problem. (Their docs actually used to say that anything other that table-level locking was a bad idea.) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we lost Uber as a user
* Jim Nasby (jim.na...@bluetreble.com) wrote: > On 7/28/16 10:05 AM, Alex Ignatov wrote: > >>Just curious: what if PostgreSQL supported index that stores "primary > >>key" (or unique key) instead of tids? > > > >You mean IOT like Oracle have? > > IIRC, IOT either stores the table in index order, which is something > different. IOT is definitely an interesting idea that I'd like to see us pursue, but I agree that it's something different. > What Alex is proposing is an index method that stores a datum > instead of a ctid. You would then use that datum to probe a > different index to get the ctid. Or put simply, you have a PK index > that contains ctid's, and a bunch of other indexes that contain a PK > value instead of ctid's. Right, that's the MySQL approach, which has advantages and disadvantages. > I think it's an idea worth pursuing, but I don't see how you can > make it work with our MVCC system unless we drop the aversion to > scanning back into an index as part of an update. I'm not terribly excited about the MySQL approach, personally, but I really like the idea of trying to make HOT updates smarter and allow HOT updates for indexes which don't include TIDs, as Robert and Alvaro are discussing. Another thought that was kicking around in my head related to that is if we could have indexes that only provide page-level information (similar to BRIN, but maybe a btree) and which also would allow HOT updates. Those indexes would typically be used in a bitmap index scan where we're going to be doing a bitmap heap scan with a recheck, of course, though I wonder if we could come up with a way to do an in-order bitmap index scan where we sort the tuples on the page and then perform some kind of mergejoin recheck (or just pull out whatever the lowest-not-seen each time we sort the tuples on the page). All very hand-wavy, of course, and it'd make sense to make the concept work for BRIN before we consider anything else, but it seems like there could be a use-case for allowing indexes other than BRIN to be built in a way that allows HOT updates to happen, thus eliminating the cost of having to update those indexes when the tuple is changed, in many cases. Of course, those indexes couldn't be used UNIQUE indexes or used for primary keys, and adjusting the parameters to a BRIN index you could possibly get a similar index, but this might allow such an index to still be usable for index-only scans, which a BRIN index will never be able to provide. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Why we lost Uber as a user
On 7/28/16 10:05 AM, Alex Ignatov wrote: Just curious: what if PostgreSQL supported index that stores "primary key" (or unique key) instead of tids? You mean IOT like Oracle have? IIRC, IOT either stores the table in index order, which is something different. What Alex is proposing is an index method that stores a datum instead of a ctid. You would then use that datum to probe a different index to get the ctid. Or put simply, you have a PK index that contains ctid's, and a bunch of other indexes that contain a PK value instead of ctid's. I think it's an idea worth pursuing, but I don't see how you can make it work with our MVCC system unless we drop the aversion to scanning back into an index as part of an update. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup wish list
On Thu, Jul 28, 2016 at 7:34 PM, Fujii Masao wrote: > On Thu, Jul 28, 2016 at 10:16 PM, Amit Kapila wrote: > >> I think there is some value in providing >> .tar for -Z 0, > > I was thinking that "-Ft -Z0" is something like an alias of "-Ft". > That is, the backup is taken in uncompressed tar format. > >> however in that case how should we define usage of -F p >> -Z 0? Shall we say with plain format -Z 0 gets ignored or throw error >> or do something else? If first, then I think it is better to mention >> the same in docs. > > ISTM that it's better to ignore that case, like pg_dump -Ft -Z0 > doesn't throw an error. > Okay, then you can go ahead with your patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "Strong sides of MySQL" talk from PgDay16Russia, translated
On Thu, Jul 28, 2016 at 8:39 PM, Tatsuo Ishii wrote: > BTW, is there any opposite information, i.e. showing the > limitation of MySQL comparing with PostgreSQL? I'm not aware of a general list on the topic, but in reviewing academic papers regarding transaction isolation I did find (and confirm) that MySQL InnoDB relaxes the "strict" aspect of the Strict 2 Phase Locking they use for implementing serializable transactions. "For performance reasons" they drop the locks acquired during the transaction *before* ensuring crash/recovery persistence. This is more-or-less equivalent to always running with synchronous_commit = off as well as allowing a small window for serialization anomalies in corner cases. The PostgreSQL synchronous_commit option allows a similar performance benefit (where the trade-off is deemed justified) without risking data integrity in the same way. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimizing numeric SUM() aggregate
I've tested patch with this query https://gist.github.com/x4m/fee16ed1a55217528f036983d88771b4 Test specs were: Ubuntu 14 LTS VM, dynamic RAM, hypervisor Windows Server 2016, SSD disk, core i5-2500. Configuration: out of the box master make. On 10 test runs timing of select statement: AVG 3739.8 ms, STD 435.4193 With patch applied (as is) : 3017.8 ms, STD 319.893 Increase of overflow const showed no statistically viable performance improvement (though, I do not worth doing). 2016-07-27 17:32 GMT+05:00 Tom Lane : > For that matter, spelling INT_MAX as 0x7FF is also not per project style. Sure, 0x7FF was not for code but for metal arithmetics. I would even add that INT_MAX is system-dependent and varies in different specs. I'd suggest INT32_MAX. Best regards, Andrey Borodin, Octonica & Ural Federal University. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
Hello Some time ago we discussed an idea of "fast temporary tables": https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu In two words the idea is following. PostgreSQL stores information about all relations in pg_catalog. Some applications create and delete a lot of temporary tables. It causes a bloating of pg_catalog and running auto vacuum on it. It's quite an expensive operation which affects entire database performance. We could introduce a new type of temporary tables. Information about these tables is stored not in a catalog but in backend's memory. This way user can solve a pg_catalog bloating problem and improve overall database performance. I took me a few months but eventually I made it work. Attached patch has some flaws. I decided not to invest a lot of time in documenting it or pgindent'ing all files yet. In my experience it will be rewritten entirely 3 or 4 times before merging anyway :) But it _works_ and passes all tests I could think of, including non-trivial cases like index-only or bitmap scans of catalog tables. Usage example: ``` CREATE FAST TEMP TABLE fasttab_test1(x int, s text); INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc'); UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2; DELETE FROM fasttab_test1 WHERE x = 3; SELECT * FROM fasttab_test1 ORDER BY x; DROP TABLE fasttab_test1; ``` More sophisticated examples could be find in regression tests: ./src/test/regress/sql/fast_temp.sql Any feedback on this patch will be much appreciated! -- Best regards, Aleksander Alekseev diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 0689cc9..940210b 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1693,7 +1693,7 @@ p = permanent table, u = unlogged table, - t = temporary table + t = temporary table, f = fast temporary table diff --git a/src/backend/access/common/Makefile b/src/backend/access/common/Makefile index 1fa6de0..56de4dc 100644 --- a/src/backend/access/common/Makefile +++ b/src/backend/access/common/Makefile @@ -12,7 +12,7 @@ subdir = src/backend/access/common top_builddir = ../../../.. include $(top_builddir)/src/Makefile.global -OBJS = heaptuple.o indextuple.o printtup.o reloptions.o scankey.o \ +OBJS = fasttab.o heaptuple.o indextuple.o printtup.o reloptions.o scankey.o \ tupconvert.o tupdesc.o include $(top_srcdir)/src/backend/common.mk diff --git a/src/backend/access/common/fasttab.c b/src/backend/access/common/fasttab.c new file mode 100644 index 000..0a20247 --- /dev/null +++ b/src/backend/access/common/fasttab.c @@ -0,0 +1,1678 @@ +/* TODO TODO comment the general idea - in-memory tuples and indexes, hooks principle, FasttabSnapshots, etc */ + +#include "c.h" +#include "postgres.h" +#include "pgstat.h" +#include "miscadmin.h" +#include "access/amapi.h" +#include "access/fasttab.h" +#include "access/relscan.h" +#include "access/valid.h" +#include "access/sysattr.h" +#include "access/htup_details.h" +#include "catalog/pg_class.h" +#include "catalog/pg_type.h" +#include "catalog/pg_depend.h" +#include "catalog/pg_inherits.h" +#include "catalog/pg_statistic.h" +#include "storage/bufmgr.h" +#include "utils/rel.h" +#include "utils/inval.h" +#include "utils/memutils.h" + +/* + TYPEDEFS, MACRO DECLARATIONS AND CONST STATIC VARIABLES + */ + +/* #define FASTTAB_DEBUG 1 */ + +#ifdef FASTTAB_DEBUG +static int32 fasttab_scan_tuples_counter = -1; +#endif + +/* list of in-memory catalog tuples */ +typedef struct +{ + dlist_node node; + HeapTuple tup; +} DListHeapTupleData; + +typedef DListHeapTupleData *DListHeapTuple; + +#define FasttabSnapshotIsAnonymous(sn) ( !PointerIsValid((sn)->name) ) +#define FasttabSnapshotIsRoot(sn) ( !PointerIsValid((sn)->prev) ) +#define FasttabTransactionInProgress() \ + ( PointerIsValid(FasttabSnapshotGetCurrent()->prev)) +/* like strcmp but for integer types --- int, uint32, Oid, etc */ +#define FasttabCompareInts(x, y) ( (x) == (y) ? 0 : ( (x) > (y) ? 1 : -1 )) + +struct FasttabSnapshotData; /* forward declaration required for + * relation_is_inmem_tuple_function typedef */ +typedef struct FasttabSnapshotData *FasttabSnapshot; + +typedef bool (*relation_is_inmem_tuple_function) + (Relation relation, HeapTuple tup, FasttabSnapshot fasttab_snapshot, + int tableIdx); + +#define FasttabRelationMaxOidAttributes 2 + +typedef const struct +{ + Oid relationId; + relation_is_inmem_tuple_function is_inmem_tuple_fn; + AttrNumber noidattr; + AttrNumber attrNumbers[FasttabRelationMaxOidAttributes]; +} FasttabRelationMethodsData; + +typedef FasttabRelationMethodsData const *FasttabRelationMethods; + +static bool generic_is_inmem_tuple(Relation relation, HeapTuple tup, + FasttabSnapshot fasttab_snapshot, int tableIdx); +stat
Re: [HACKERS] System load consideration before spawning parallel workers
On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi wrote: > we observed that spawning the specified number of parallel workers for > every query that satisfies for parallelism is sometimes leading to > performance drop compared to improvement during the peak system load > with other processes. Adding more processes to the system is leading > to more context switches thus it reducing the performance of other SQL > operations. > Have you consider to tune using max_worker_processes, basically I think even if you have kept the moderate value for max_parallel_workers_per_gather, the number of processes might increase if total number allowed is much bigger. Are the total number of parallel workers more than number of CPU's/cores in the system? If yes, I think that might be one reason for seeing performance degradation. > In order to avoid this problem, how about adding some kind of system > load consideration into account before spawning the parallel workers? > Hook could be a possibility, but not sure how users are going to decide the number of parallel workers, there might be other backends as well which can consume resources. I think we might need some form of throttling w.r.t assignment of parallel workers to avoid system overload. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL logging problem in 9.4.3?
On Thu, Jul 28, 2016 at 4:59 PM, Michael Paquier wrote: > On Wed, Apr 6, 2016 at 3:11 PM, Michael Paquier > wrote: >> On Wed, Mar 23, 2016 at 12:45 PM, Michael Paquier >> wrote: >>> On Wed, Mar 23, 2016 at 11:11 AM, David Steele wrote: I would prefer not to bump it to the next CF unless we decide this will not get fixed for 9.6. >>> >>> It may make sense to add that to the list of open items for 9.6 >>> instead. That's not a feature. >> >> So I have moved this patch to the next CF for now, and will work on >> fixing it rather soonishly as an effort to stabilize 9.6 as well as >> back-branches. > > Well, not that soon at the end, but I am back on that... I have not > completely reviewed all the code yet, and the case of index relation > referring to a relation optimized with truncate is still broken, but > for now here is a rebased patch if people are interested. I am going > to get as well a TAP tests out of my pocket to ease testing. The patch I sent yesterday was based on an incorrect version. Attached is a slightly-modified version of the last one I found here (https://www.postgresql.org/message-id/56b342f5.1050...@iki.fi), which is rebased on HEAD at ed0b228. I have also converted the test case script of upthread into a TAP test in src/test/recovery that covers 3 cases and I included that in the patch: 1) CREATE + INSERT + COPY => crash 2) CREATE + trigger + COPY => crash 3) CREATE + TRUNCATE + COPY => incorrect number of rows. The first two tests make the system crash, the third one reports an incorrect number of rows. This is registered in next CF by the way: https://commitfest.postgresql.org/10/528/ Thoughts? -- Michael fix-wal-level-minimal-michael-2.patch Description: invalid/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers