Re: [HACKERS] Alter or rename enum value
On 3/28/16 4:42 AM, Emre Hasegeli wrote:
> Now, we are using a function to replace an enum type on all tables with
> another one, but we are not at all happy with this solution. It
> requires all objects which were using the enum to be dropped and
> recreated, and it rewrites the tables, so it greatly increases the
> migration time and effort.

FWIW, there are ways to avoid some of that pain by having a trigger
maintain the new column on INSERT/UPDATE and then slowly touching all
the old rows where the new column is NULL. Obviously would be much
better if we could just do this with ENUMs...
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] backup tools ought to ensure created backups are durable
On 3/28/16 11:03 AM, Magnus Hagander wrote:
> That should work yeah. And given that we already use that check in
> other places, it seems it should be perfectly safe. And as long as we
> only do a WARNING and not abort if the fsync fails, we should be OK if
> people intentionally store their backups on an fs that doesn't speak
> fsync (if that exists), in which case I don't really think we even
> need a switch to turn it off.

I'd even go so far as spitting out a warning any time we can't fsync
(maybe that's what you're suggesting?)
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] IF (NOT) EXISTS in psql-completion
Thank you Pavel, David. Thank you for pointing out syntaxes to be
addressed. Most of them are addressed in the attached patch.

At Tue, 22 Mar 2016 12:57:27 -0400, David Steele wrote in
<56f17977.8040...@pgmasters.net>
> Hi Kyotaro,
>
> On 3/18/16 3:22 AM, Pavel Stehule wrote:
>
> > I am looking this patch. It looks well, but this feature doesn't
> > respect upper or lower chars. It enforce upper chars. This is not
> > consistent with any other autocomplete.

As mentioned before, the upper/lower problem is an existing issue. The
case of the words in a query result list cannot be edited since it may
contain words that should not be changed, such as relation names. So we
can address it only before issuing a query, but I haven't found a
simple way to do it.

> > I checked it against sql help and these statements doesn't work

Thank you very much.

> > alter foreign table drop column
> > drop cast
> > drop operator
> > drop transform -- missing autocomplete completely

These are not done. Each of them has issues to be addressed before
adding completion of IF EXISTS.

> > alter text search configuration jjj drop mapping
> > alter type hhh drop attribute
> > drop extension

Done.

> > drop text search

I don't see the syntax "drop text search [if exists]". drop text search
(configuration|dictionary|parser|template) are already addressed.

> > drop user mapping

"drop user" was not completed with "mapping". I added it then addressed
this. (This might be another issue.)

> > alter table jjj add column

Done if it is mentioning DROP COLUMN. But two new macros, HeadMatches6
and 7, are introduced together.

> > create temp sequence
> > create sequence

DROP SEQUENCE is already completed with IF EXISTS. CREATE [TEMP]
SEQUENCE with IF NOT EXISTS is added.

> Do you have an idea of when you will have a new patch ready?

Sorry to have been late. The attached is the revised version.
regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

>From 2f46c0aa00fd8fd6357dcb76a26e49e3a66e2572 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Fri, 5 Feb 2016 16:50:35 +0900
Subject: [PATCH] Suggest IF (NOT) EXISTS for tab-completion of psql

This patch lets psql suggest "IF (NOT) EXISTS". In addition to that,
since this patch introduces some mechanism for syntactical robustness,
it allows psql completion to omit some optional parts on matching.

---
 src/bin/psql/tab-complete.c | 626 
 1 file changed, 524 insertions(+), 102 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6a81416..73c5601 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -656,6 +656,10 @@ static const SchemaQuery Query_for_list_of_matviews = {
 " FROM pg_catalog.pg_roles "\
 " WHERE substring(pg_catalog.quote_ident(rolname),1,%d)='%s'"
 
+#define Query_for_list_of_rules \
+"SELECT pg_catalog.quote_ident(rulename) FROM pg_catalog.pg_rules "\
+" WHERE substring(pg_catalog.quote_ident(rulename),1,%d)='%s'"
+
 #define Query_for_list_of_grant_roles \
 " SELECT pg_catalog.quote_ident(rolname) "\
 " FROM pg_catalog.pg_roles "\
@@ -763,6 +767,11 @@ static const SchemaQuery Query_for_list_of_matviews = {
 "SELECT pg_catalog.quote_ident(tmplname) FROM pg_catalog.pg_ts_template "\
 " WHERE substring(pg_catalog.quote_ident(tmplname),1,%d)='%s'"
 
+#define Query_for_list_of_triggers \
+"SELECT pg_catalog.quote_ident(tgname) FROM pg_catalog.pg_trigger "\
+" WHERE substring(pg_catalog.quote_ident(tgname),1,%d)='%s' AND "\
+" NOT tgisinternal"
+
 #define Query_for_list_of_fdws \
 " SELECT pg_catalog.quote_ident(fdwname) "\
 " FROM pg_catalog.pg_foreign_data_wrapper "\
@@ -906,7 +915,7 @@ static const pgsql_thing_t words_after_create[] = {
 	{"PARSER", Query_for_list_of_ts_parsers, NULL, THING_NO_SHOW},
 	{"POLICY", NULL, NULL},
 	{"ROLE", Query_for_list_of_roles},
-	{"RULE", "SELECT pg_catalog.quote_ident(rulename) FROM pg_catalog.pg_rules WHERE substring(pg_catalog.quote_ident(rulename),1,%d)='%s'"},
+	{"RULE", Query_for_list_of_rules},
 	{"SCHEMA", Query_for_list_of_schemas},
 	{"SEQUENCE", NULL, &Query_for_list_of_sequences},
 	{"SERVER", Query_for_list_of_servers},
@@ -915,7 +924,7 @@ static const pgsql_thing_t words_after_create[] = {
 	{"TEMP", NULL, NULL, THING_NO_DROP},	/* for CREATE TEMP TABLE ... */
 	{"TEMPLATE", Query_for_list_of_ts_templates, NULL, THING_NO_SHOW},
 	{"TEXT SEARCH", NULL, NULL},
-	{"TRIGGER", "SELECT pg_catalog.quote_ident(tgname) FROM pg_catalog.pg_trigger WHERE substring(pg_catalog.quote_ident(tgname),1,%d)='%s' AND NOT tgisinternal"},
+	{"TRIGGER", Query_for_list_of_triggers},
 	{"TYPE", NULL, &Query_for_list_of_datatypes},
 	{"UNIQUE", NULL, NULL, THING_NO_DROP},	/* for CREATE UNIQUE INDEX ... */
 	{"UNLOGGED", NULL, NULL, THING_NO_DROP},	/* for CREATE UNLOGGED TABLE
@@ -944,6 +953,7 @@ static char **complete_from_variables(const char *text, const char *prefix, const char *suffix, bool
Re: [HACKERS] Odd system-column handling in postgres_fdw join pushdown patch
On 2016/03/25 17:16, Etsuro Fujita wrote:
> The approach that we discussed would minimize the code for the FDW
> author to write, by providing the support functions you proposed. I'll
> post a patch for that early next week.

I added two helper functions: GetFdwScanTupleExtraData and
FillFdwScanTupleSysAttrs. The FDW author could use the former to get
info about system attributes other than ctids and oids in
fdw_scan_tlist during BeginForeignScan, and the latter to set values
for these system attributes during IterateForeignScan
(InvalidTransactionId for xmins/xmaxs, InvalidCommandId for
cmins/cmaxs, and valid values for tableoids). Attached is a proposed
patch for that.

I also slightly simplified the changes to make_tuple_from_result_row
and conversion_error_callback made by the postgres_fdw join pushdown
patch.

What do you think about that?

Best regards,
Etsuro Fujita

postgres-fdw-join-pushdown-syscol-handling-v2.patch
Description: application/download
Re: [HACKERS] Re: [COMMITTERS] pgsql: Sync tzload() and tzparse() APIs with IANA release tzcode2016c.
On Mon, Mar 28, 2016 at 10:09 PM, Tom Lane wrote:
> Christian Ullrich writes:
>> * Tom Lane wrote:
>>> Yeah. I've been staring at that for awhile, but it's not clear where
>>> the problem is. There are a bunch of other SET TIME ZONE commands in
>>> the regression tests, how is it that this trivial case fails on the
>>> Windows critters?
>
>> zic aborts somewhere between writing Etc/UTC and UTC.
>
> Huh ... I would not have guessed that. Can you track down exactly
> where it's failing?

It is failing when attempting to create the link for Pacific_new, and
so the rest is not created at all.

> We absorbed some new code in zic.c for creating subdirectories of the
> target timezone directory, and said code seemed a bit odd to me,
> but I supposed that the IANA guys had debugged it already. Maybe not.
> Or maybe the problem is with some specific input file that gets
> reached somewhere in that range?

The issue is visibly in dolink(). The first time link() is called it
fails with EEXIST (?), but there is no retry because the target link,
US/Pacific-new, is not a directory. But note that contrary to the
previous version of the code, link() is not attempted again; symlink()
is called instead (HAVE_SYMLINK is actually missing here!), or an
open/getc/close is done to copy the file.

With the WIP patch attached, I am still getting 4 warnings "symbolic
link used because hard link failed", but at least it fixes the issue.
-- 
Michael

zic-ms-fix.patch
Description: binary/octet-stream
Re: [HACKERS] extend pgbench expressions with functions
> > I guess the question here is whether we want the part-c patch, which
> > removes the historical \setrandom syntax. That's technically no
> > longer needed since we now can use random() as a function directly
> > inside \set commands, but we might want it anyway for backward
> > compatibility. This patch is indeed a proposal. Anybody have an
> > opinion on that?
>
> +1 for nuking it. That's not worth the trouble maintaining it.

I share Michaël's opinion. Some arguments for removing it:

- \setrandom is syntactically inhomogeneous in the overall syntax, and
  is now redundant
- switching to the \set syntax is pretty easy, see attached script
- custom scripts are short; they are used by few but advanced users,
  for whom updating would not be an issue
- the parsing & execution codes are lengthy, repetitive...

-- 
Fabien.

#! /usr/bin/perl -wp
s/^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)\s+(exponential|gaussian)\s+(\S+)/\\set $1 random_$4($2, $3, $5)/;
s/^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)(\s+uniform)?/\\set $1 random($2, $3)/;
Re: [HACKERS] Re: [COMMITTERS] pgsql: Sync tzload() and tzparse() APIs with IANA release tzcode2016c.
Christian Ullrich writes:
> * Tom Lane wrote:
>> Yeah. I've been staring at that for awhile, but it's not clear where
>> the problem is. There are a bunch of other SET TIME ZONE commands in
>> the regression tests, how is it that this trivial case fails on the
>> Windows critters?

> zic aborts somewhere between writing Etc/UTC and UTC.

Huh ... I would not have guessed that. Can you track down exactly
where it's failing?

We absorbed some new code in zic.c for creating subdirectories of the
target timezone directory, and said code seemed a bit odd to me, but I
supposed that the IANA guys had debugged it already. Maybe not. Or
maybe the problem is with some specific input file that gets reached
somewhere in that range?

			regards, tom lane
[HACKERS] Re: [COMMITTERS] pgsql: Sync tzload() and tzparse() APIs with IANA release tzcode2016c.
On Tue, Mar 29, 2016 at 1:36 PM, Christian Ullrich wrote:
> * Tom Lane wrote:
>
>> Michael Paquier writes:
>>>
>>> Buildfarm-not-being-happy-status: woodlouse, mastodon, thrips, jacana.
>>>
>>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=woodlouse&dt=2016-03-29%2000%3A42%3A08
>>> The origin of the problem is this, which causes all the subsequent
>>> queries to fail:
>>>    SET TimeZone to 'UTC';
>>>    + ERROR:  invalid value for parameter "TimeZone": "UTC"
>>
>> Yeah. I've been staring at that for awhile, but it's not clear where
>> the problem is. There are a bunch of other SET TIME ZONE commands in
>> the regression tests, how is it that this trivial case fails on the
>> Windows critters?
>
> I think this is the reason, from the check log on woodlouse (jacana
> says the same in make style):
>
> Generating timezone files...release\zic\zic: Can't create
> C:/buildfarm/buildenv/HEAD/pgsql.build/tmp_install/share/timezone/US/Pacific-New:
> No such file or directory

Yes, I have bumped into that when running the build. And I think that
the error is in zic.c, in dolink() when performing a Link operation,
because this parent directory is not created because of that:

    if (link_errno == ENOENT || link_errno == ENOTSUP)
    {
        if (!mkdirs(toname))
            exit(EXIT_FAILURE);
        retry_if_link_supported = true;
    }

I think that we'd want here to check as well on EISDIR or EACCES...
Haven't checked yet though. I'll update this thread with hopefully a
patch.
-- 
Michael
Re: [HACKERS] [COMMITTERS] pgsql: pgbench: Support double constants and functions.
On Tue, Mar 29, 2016 at 1:56 PM, Tom Lane wrote:
> Michael Paquier writes:
>> On Tue, Mar 29, 2016 at 9:52 AM, Robert Haas wrote:
>>> pgbench: Support double constants and functions.
>
>> Instead of INT64_MIN and INT64_MAX, I think that it would be better to
>> use PG_INT64_MIN and PG_INT64_MAX.
>
> Indeed.
>
>> Note as well that DOUBLE is an existing variable type in VS (while
>> INTEGER is not),
>
> Ooops; that one was harder to foresee.

Thanks for the push.

>> A way to fix compilation here is to rename those tokens to something
>> that will never conflict, like in the attached to DOUBLE_VAR and
>> INTEGER_VAR, both exprparse.y and exprparse.l need an update.
>
> Agreed, but these symbols represent constants not variables, so
> I used DOUBLE_CONST and INTEGER_CONST instead. Pushed with those
> corrections, so that we can get back to looking at my own bugs :-(

I think I know what's going on here...
-- 
Michael
Re: [HACKERS] [COMMITTERS] pgsql: pgbench: Support double constants and functions.
Michael Paquier writes:
> On Tue, Mar 29, 2016 at 9:52 AM, Robert Haas wrote:
>> pgbench: Support double constants and functions.

> Instead of INT64_MIN and INT64_MAX, I think that it would be better to
> use PG_INT64_MIN and PG_INT64_MAX.

Indeed.

> Note as well that DOUBLE is an existing variable type in VS (while
> INTEGER is not),

Ooops; that one was harder to foresee.

> A way to fix compilation here is to rename those tokens to something
> that will never conflict, like in the attached to DOUBLE_VAR and
> INTEGER_VAR, both exprparse.y and exprparse.l need an update.

Agreed, but these symbols represent constants not variables, so I used
DOUBLE_CONST and INTEGER_CONST instead. Pushed with those corrections,
so that we can get back to looking at my own bugs :-(

			regards, tom lane
Re: [HACKERS] Relation extension scalability
On Tue, Mar 29, 2016 at 3:21 AM, Petr Jelinek wrote:
> On 28/03/16 14:46, Dilip Kumar wrote:
>
>> Conclusion:
>> -----------
>> 1. I think v15 is solving the problem that exists with v13, and
>> performance is significantly higher compared to base, and relation
>> size is also stable. So IMHO v15 is the winner over the other
>> solutions; what do others think?
>
> It seems so, do you have the ability to reasonably test with 64
> clients? I am mostly wondering if we see the performance going further
> down or just plateau.

Yes, that makes sense. One more point is that if the reason for v13
giving better performance is extra blocks (which we believe in certain
cases can leak till the time Vacuum updates the FSM tree), do you think
it makes sense to once test by increasing the lockWaiters * 20 limit to
maybe lockWaiters * 25 or lockWaiters * 30?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
[HACKERS] Re: [COMMITTERS] pgsql: Sync tzload() and tzparse() APIs with IANA release tzcode2016c.
* Tom Lane wrote:
> Michael Paquier writes:
>> Buildfarm-not-being-happy-status: woodlouse, mastodon, thrips, jacana.
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=woodlouse&dt=2016-03-29%2000%3A42%3A08
>> The origin of the problem is this, which causes all the subsequent
>> queries to fail:
>>    SET TimeZone to 'UTC';
>>    + ERROR:  invalid value for parameter "TimeZone": "UTC"
>
> Yeah. I've been staring at that for awhile, but it's not clear where
> the problem is. There are a bunch of other SET TIME ZONE commands in
> the regression tests, how is it that this trivial case fails on the
> Windows critters?

I think this is the reason, from the check log on woodlouse (jacana
says the same in make style):

Generating timezone files...release\zic\zic: Can't create
C:/buildfarm/buildenv/HEAD/pgsql.build/tmp_install/share/timezone/US/Pacific-New:
No such file or directory

zic aborts somewhere between writing Etc/UTC and UTC. This is how the
toplevel directory ends up after the error:

        Africa
        America
        Antarctica
        Arctic
        Asia
        Atlantic
        Australia
  2.102 CET
  2.294 CST6CDT
  1.876 EET
    127 EST
  2.294 EST5EDT
        Etc
        Europe
    264 Factory
    128 HST
        Indian
  2.102 MET
    127 MST
  2.294 MST7MDT
        Pacific
  2.294 PST8PDT
  1.873 WET

After I manually created the "US" directory, zic had no further trouble
putting files into it.

-- 
Christian
Re: [HACKERS] [COMMITTERS] pgsql: pgbench: Support double constants and functions.
On Tue, Mar 29, 2016 at 9:52 AM, Robert Haas wrote:
> pgbench: Support double constants and functions.
>
> The new functions are pi(), random(), random_exponential(),
> random_gaussian(), and sqrt(). I was worried that this would be
> slower than before, but, if anything, it actually turns out to be
> slightly faster, because we now express the built-in pgbench scripts
> using fewer lines; each \setrandom can be merged into a subsequent
> \set.

Buildfarm is complaining about a couple of things here:

src/bin/pgbench/exprparse.c(139): error C2365: 'DOUBLE' : redefinition; previous definition was 'typedef' [C:\buildfarm\buildenv\HEAD\pgsql.build\pgbench.vcxproj]
src/bin/pgbench/pgbench.c(1046): error C2065: 'INT64_MIN' : undeclared identifier [C:\buildfarm\buildenv\HEAD\pgsql.build\pgbench.vcxproj]
src/bin/pgbench/exprparse.c(139): error C2086: 'yytokentype DOUBLE' : redefinition [C:\buildfarm\buildenv\HEAD\pgsql.build\pgbench.vcxproj]
src/bin/pgbench/pgbench.c(1046): error C2065: 'INT64_MAX' : undeclared identifier [C:\buildfarm\buildenv\HEAD\pgsql.build\pgbench.vcxproj]
src/bin/pgbench/exprscan.l(131): error C2275: 'DOUBLE' : illegal use of this type as an expression (src/bin/pgbench/exprparse.c) [C:\buildfarm\buildenv\HEAD\pgsql.build\pgbench.vcxproj]

Instead of INT64_MIN and INT64_MAX, I think that it would be better to
use PG_INT64_MIN and PG_INT64_MAX. Note as well that DOUBLE is an
existing variable type in VS (while INTEGER is not), and it is true
that it is not directly documented:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms221627%28v=vs.85%29.aspx

A way to fix compilation here is to rename those tokens to something
that will never conflict, like in the attached to DOUBLE_VAR and
INTEGER_VAR; both exprparse.y and exprparse.l need an update.
-- 
Michael

pgbench-fix-vs.patch
Description: binary/octet-stream
[HACKERS] psql stops completion for the only matching schema name.
Hello, I found that COMPLETE_WITH_SCHEMA_QUERY doesn't complete
only-one-matching schema names.

=# alter foreign table 
information_schema.  pg_temp_1.  pg_toast_temp_1.
pg_catalog.          pg_toast.   public.
=# alter foreign table i
or
=# alter foreign table pu

This makes for more perplexing behavior with an add-on query.

=# create schema alpha;
=# alter table 
ALL IN TABLESPACE    pg_catalog.  pg_toast_temp_1.
alpha.               pg_temp_1.   public.
information_schema.  pg_toast.
=# alter table a
=# alter table ALL IN TABLESPACE _

Where has my alpha gone?

_complete_from_query adds schema names only if more than one schema
name matches the current input. The 'i' matches only one schema name.
The reason for the behavior seems to be to not add a schema name when
qualified names in the schema are added. So the attached patch does
exactly that. psql behaves as follows with this patch.

=# create schema alpha;
=# create table alpha.a (a int);
=# create table alpha.b (a int);
=# alter table 
ALL IN TABLESPACE    pg_catalog.  pg_toast_temp_1.
alpha.               pg_temp_1.   public.
information_schema.  pg_toast.
=# alter table al
ALL IN TABLESPACE  alpha.a  alpha.b
=# alter table alp
=# alter table alpha.
alpha.a  alpha.b

This seems to me the intended behavior here. Any comments?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

>From c3a1220ecec07af660fc184e5a1d24baa1fe913b Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Tue, 29 Mar 2016 12:13:03 +0900
Subject: [PATCH] Fix the condition to decide whether to add schema names.

Currently, COMPLETE_WITH_SCHEMA_QUERY stops completion for the only
matching schema name for the current input. Schema names are added
only when there is more than one match for the current input, but the
correct condition is that there are no qualified names to match.
Qualified names are listed only when there's exactly one matching
schema name.

---
 src/bin/psql/tab-complete.c | 52 -
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6a81416..631d0d7 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3206,7 +3206,7 @@ _complete_from_query(int is_schema_query, const char *text, int state)
 	 */
 	if (state == 0)
 	{
-		PQExpBufferData query_buffer;
+		PQExpBufferData query_buffer, tmp_buffer;
 		char	   *e_text;
 		char	   *e_info_charp;
 		char	   *e_info_charp2;
@@ -3279,26 +3279,12 @@ _complete_from_query(int is_schema_query, const char *text, int state)
 		}
 
 		/*
-		 * Add in matching schema names, but only if there is more than
-		 * one potential match among schema names.
+		 * Make query to collect matching qualified names, but only if
+		 * there is exactly one schema matching the input-so-far. This
+		 * query is made separately since used twice.
 		 */
-		appendPQExpBuffer(&query_buffer, "\nUNION\n"
-						  "SELECT pg_catalog.quote_ident(n.nspname) || '.' "
-						  "FROM pg_catalog.pg_namespace n "
-						  "WHERE substring(pg_catalog.quote_ident(n.nspname) || '.',1,%d)='%s'",
-						  char_length, e_text);
-		appendPQExpBuffer(&query_buffer,
-						  " AND (SELECT pg_catalog.count(*)"
-						  " FROM pg_catalog.pg_namespace"
-						  " WHERE substring(pg_catalog.quote_ident(nspname) || '.',1,%d) ="
-						  " substring('%s',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) > 1",
-						  char_length, e_text);
-
-		/*
-		 * Add in matching qualified names, but only if there is exactly
-		 * one schema matching the input-so-far.
-		 */
-		appendPQExpBuffer(&query_buffer, "\nUNION\n"
+		initPQExpBuffer(&tmp_buffer);
+		appendPQExpBuffer(&tmp_buffer,
+						  "SELECT pg_catalog.quote_ident(n.nspname) || '.' || %s "
 						  "FROM %s, pg_catalog.pg_namespace n "
 						  "WHERE %s = n.oid AND ",
@@ -3306,9 +3292,9 @@ _complete_from_query(int is_schema_query, const char *text, int state)
 						  completion_squery->catname,
 						  completion_squery->namespace);
 		if (completion_squery->selcondition)
-			appendPQExpBuffer(&query_buffer, "%s AND ",
+			appendPQExpBuffer(&tmp_buffer, "%s AND ",
 							  completion_squery->selcondition);
-		appendPQExpBuffer(&query_buffer, "substring(pg_catalog.quote_ident(n.nspname) || '.' || %s,1,%d)='%s'",
+		appendPQExpBuffer(&tmp_buffer, "substring(pg_catalog.quote_ident(n.nspname) || '.' || %s,1,%d)='%s'",
 						  qualresult,
 						  char_length, e_text);
@@ -3316,17 +3302,35 @@ _complete_from_query(int is_schema_query, const char *text, int state)
 		 * This condition exploits the single-matching-schema rule to
 		 * speed up the query
 		 */
-		appendPQExpBuffer(&query_buffer,
+		appendPQExpBuffer(&tmp_buffer,
 						  " AND substring(pg_catalog.quote_ident(n.nspname) || '.',1,%d) ="
 						  " substring('%s',1,pg_ca
Re: [HACKERS] raw output from copy
Hi

>>> Anyway this is certainly not committable as-is, so I'm setting it
>>> back to Waiting on Author. But the fact that both libpq and ecpg
>>> would need updates makes me question whether we can safely pretend
>>> that this isn't a protocol break.
>>
>> In that case I humbly submit that there is a case for reviving the
>> psql patch I posted that kicked off this whole thing and lets you
>> export a piece of binary data from psql quite easily. It should
>> certainly not involve any protocol break.
>
> The psql only solution can work only for output. Doesn't help with
> input.

In this case, I think the features of the COPY statement are perfect
for this. The way from a content to a file is direct. In psql you have
to turn off the tuple separator and the record separator, and you have
to set an output file. You can get the same effect, but with more work.
In the previous version it was relatively hard to use from the command
line - now, with the multi-command -c, it is much simpler, but still
COPY is the ideal.

I agree that output formats in psql would be a nice feature. And it
would be pretty nice if we supported more common formats - like csv,
simple xml, simple json. I believe some redundancy is acceptable if the
cost is not too high.

Sorry for the offtopic - I would like to see some output formats on the
client side, but the format possibilities are on the server side. So
there is a natural idea - define a server-side output format, which a
psql output format could just wrap.

Regards

Pavel
Re: [HACKERS] raw output from copy
Hi

2016-03-29 0:26 GMT+02:00 Tom Lane :
> Pavel Stehule writes:
>> [ copy-raw-format-20160227-03.patch ]
>
> I looked at this patch. I'm having a hard time accepting that it has
> a use-case large enough to justify it, and here's the reason: it's
> a protocol break. Conveniently omitting to update protocol.sgml
> doesn't make it not a protocol break. (libpq.sgml also contains
> assorted statements that are falsified by this patch.)

The reply to this question depends on how strict we want to be. This
doesn't change the format in the types stream, but it creates a new
enum value. Correctly written code should raise an exception when it is
processing an unknown enum value. I'll run tests against an old libpq.

> You could argue that it's the user's own fault if he tries to use
> COPY RAW with client-side code that hasn't been updated to support it.
> Maybe that's okay, but I wonder if we're opening ourselves up to
> problems. Maybe even security-grade problems.
>
> In terms of specific code that hasn't been updated, ecpg is broken
> by this patch, and I'm not very sure what libpq's PQbinaryTuples()
> ought to do but probably something other than what it does today.
>
> There's also a definitional question of what we think PQfformat() ought
> to do; should it return "2" for the per-field format? Or maybe the
> per-field format is still "1", since it's after all the same binary data
> format as for COPY BINARY, and only the overall copy format reported by
> PQbinaryTuples() should change to "2".

I'll recheck it.

> BTW, I'm not really sure why the patch is trying to enforce single
> row and column for the COPY OUT case. I thought the idea for that
> was that we'd just shove out the data without any delimiters, and
> if it's more than one datum it's the user's problem whether he can
> identify the boundaries. On the input side we would have to insist
> on one column since we're not going to attempt to identify boundaries
> (and one row would fall out of the fact that we slurp the entire input
> and treat it as one datum).

It should not be a problem. I thought about it. The COPY statement is
extensible with options. We can support more fields and more rows, if
that is required, with additional options. But now, it looks like
premature optimization.

> Anyway this is certainly not committable as-is, so I'm setting it back
> to Waiting on Author. But the fact that both libpq and ecpg would need
> updates makes me question whether we can safely pretend that this isn't
> a protocol break.

I'll run tests against some clients.

Regards

Pavel
Re: [HACKERS] [COMMITTERS] pgsql: Sync tzload() and tzparse() APIs with IANA release tzcode2016c.
Michael Paquier writes:
> Buildfarm-not-being-happy-status: woodlouse, mastodon, thrips, jacana.
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=woodlouse&dt=2016-03-29%2000%3A42%3A08
> The origin of the problem is this, which causes all the subsequent
> queries to fail:
>    SET TimeZone to 'UTC';
>    + ERROR:  invalid value for parameter "TimeZone": "UTC"

Yeah. I've been staring at that for awhile, but it's not clear where
the problem is. There are a bunch of other SET TIME ZONE commands in
the regression tests, how is it that this trivial case fails on the
Windows critters?

Probably the first thing to eliminate is whether it's zic that is
messing up (and writing a bad zone file) or the backend. Can someone
compare the $INSTALLDIR/share/timezone/UTC files between a working
build and a failing one? They should be bitwise identical (but note
I'm not expecting them to be identical to files from before the tzcode
merge commit).

			regards, tom lane
Re: [HACKERS] raw output from copy
2016-03-29 5:12 GMT+02:00 Andrew Dunstan : > > > On 03/28/2016 06:26 PM, Tom Lane wrote: > >> Pavel Stehule writes: >> >>> [ copy-raw-format-20160227-03.patch ] >>> >> I looked at this patch. I'm having a hard time accepting that it has >> a use-case large enough to justify it, and here's the reason: it's >> a protocol break. Conveniently omitting to update protocol.sgml >> doesn't make it not a protocol break. (libpq.sgml also contains >> assorted statements that are falsified by this patch.) >> >> You could argue that it's the user's own fault if he tries to use >> COPY RAW with client-side code that hasn't been updated to support it. >> Maybe that's okay, but I wonder if we're opening ourselves up to >> problems. Maybe even security-grade problems. >> >> In terms of specific code that hasn't been updated, ecpg is broken >> by this patch, and I'm not very sure what libpq's PQbinaryTuples() >> ought to do but probably something other than what it does today. >> >> There's also a definitional question of what we think PQfformat() ought >> to do; should it return "2" for the per-field format? Or maybe the >> per-field format is still "1", since it's after all the same binary data >> format as for COPY BINARY, and only the overall copy format reported by >> PQbinaryTuples() should change to "2". >> >> BTW, I'm not really sure why the patch is trying to enforce single >> row and column for the COPY OUT case. I thought the idea for that >> was that we'd just shove out the data without any delimiters, and >> if it's more than one datum it's the user's problem whether he can >> identify the boundaries. On the input side we would have to insist >> on one column since we're not going to attempt to identify boundaries >> (and one row would fall out of the fact that we slurp the entire input >> and treat it as one datum). >> >> Anyway this is certainly not committable as-is, so I'm setting it back >> to Waiting on Author. 
But the fact that both libpq and ecpg would need >> updates makes me question whether we can safely pretend that this isn't >> a protocol break. >> > > In that case I humbly submit that there is a case for reviving the psql > patch I posted that kicked off this whole thing and lets you export a piece > of binary data from psql quite easily. It should certainly not involve any > protocol break. > The psql-only solution works only for output; it doesn't help with input. Regards Pavel > > cheers > > andrew >
Re: [HACKERS] Using quicksort for every external sort run
On Thu, Mar 10, 2016 at 6:54 PM, Peter Geoghegan wrote: > I've used amcheck [2] to test this latest revision -- the tool ought > to not see any problems with any index created with the patch applied. > Reviewers might find it helpful to use amcheck, too. As 9.6 is > stabilized, I anticipate that amcheck will give us a fighting chance > at early detection of any bugs that might have slipped into tuplesort, > or a B-Tree operator class. Since we still don't even have one single > test of the external sort code [3], it's just as well. If we wanted to > test external sorting, maybe we'd do that by adding tests to amcheck, > that are not run by default, much like test_decoding, which tests > logical decoding but is not targeted by "make installcheck"; that > would allow the tests to be fairly comprehensive without being > annoying. Using amcheck neatly side-steps issues with the portability > of "expected" pg_regress output when collatable type sorting is > tested. Note that amcheck V2, which I posted just now features tests for external sorting. The way these work requires discussion. The tests are motivated in part by the recent strxfrm() debacle, as well as by the need to have at least some test coverage for this patch. It's bad that external sorting currently has no test coverage. We should try and do better there as part of this overhaul to tuplesort.c. Thanks -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
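A sketch of what such an amcheck-driven external-sort test could look like (assuming the amcheck extension from this patch series is installed; the table name, index name, row count, and memory setting are illustrative, not from the patch):

```sql
-- Hypothetical regression-style test: build an index via an external
-- sort, then ask amcheck to verify its ordering invariants.
CREATE EXTENSION IF NOT EXISTS amcheck;

CREATE TABLE sort_test (t text);
INSERT INTO sort_test
    SELECT md5(i::text) FROM generate_series(1, 100000) AS i;

-- Small enough that the CREATE INDEX sort must spill to disk.
SET maintenance_work_mem = '1MB';
CREATE INDEX sort_test_idx ON sort_test (t COLLATE "C");

-- AccessShareLock-only verification; bt_index_parent_check() is the
-- stronger, ExclusiveLock variant discussed downthread.
SELECT bt_index_check('sort_test_idx'::regclass);
```

If the sort machinery (say, an abbreviated-key comparator) produced a misordered run, the final check would raise an error instead of returning silently.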
Re: [HACKERS] amcheck (B-Tree integrity checking tool)
On Tue, Mar 22, 2016 at 10:57 AM, Peter Geoghegan wrote: > That's right - I have a small number of feedback items to work > through. I also determined myself that there could be a very low > probability race condition when checking the key space across sibling > pages, and will work to address that. I attach a V2 revision. Behavioral changes: * Fixes very low probability race condition when verifying ordering across sibling pages with bt_index_check()/AccessShareLock. Notably, the fix does not involve using the high key from the next page instead of the first item, despite my recent suggestion that it would; I changed my mind. In this revision, bt_right_page_check_scankey() continues to return the first item on the next/right page (as a scankey), just as in the first version. The locking considerations here are more complicated than before, and hopefully correct. I abandoned the high key idea when I realized that a concurrent page split could happen in the right sibling, making the high key generally no more safe in the face of concurrent target page deletion than continuing to use the first data item. Using the first data item from the right page is better than using the high key item from the right page verification-wise anyway, so that's good. I found a new way of locking down and recovering from any race conditions (the low probability bug in V1 for bt_index_check()/AccessShareLock calls to the C function bt_right_page_check_scankey()). I won't go into the gory details here, but I will mention that my new approach is inspired by how backwards scans work already. * Adds DEBUG1 traces that indicate the level and page currently being checked. Advanced users will probably find these traces useful on occasion. It's nice to be able to see how amcheck walks the B-Tree in practice. Non-behavioral changes: * Explains why an ExclusiveLock is needed on the target index by bt_index_parent_check() (and a ShareLock on its heap relation). 
This new explanation talks about why an ExclusiveLock *is* necessary when checking downlinks against child pages, per Tomas Vondra's request. (Note that nothing changes about child/downlink verification itself). Before now, I focused on why they were not necessary, but Tomas was right to point out that that isn't enough information. * No longer uses the word "comport" in an error message, per Amit Langote. Similarly, minor typos fixed, per Tomas. * Adds a variety of pg_regress-based tests for external sorting. (Via various CREATE INDEX statements involving a variety of datatypes. Data is generated pseudo-randomly.) Notably, the regression tests *never* test external sorting today. ISTM that amcheck is a convenient, reliable, and portable way of testing the sorting of collatable types like text, so this is the best way of testing external sorting. External sorting uses abbreviated keys for runs, but discards them as runs are written out, and so the merging of runs does not use abbreviated keys. Abbreviated comparisons and authoritative comparisons are therefore reliably used within the same sort. One particular goal here is that something like the recent strxfrm() debacle might be caught by this testing on some platforms (if we ever get back to trusting strxfrm() at all, that is). The style of these new tests is unorthodox, because a variety of collations are tested, which varies from system to system; we should discuss all this. The tests can take a minute or two to run on my laptop, but that could vary significantly. And so, whether or not that needs to be targeted by something other than "check" and "installcheck" is a discussion that needs to happen. I imagine a lot of buildfarm animals have slow storage, but on the other hand I don't think hackers run the contrib regression tests so often. 
I prefer to not follow the example of test_decoding if possible; test_decoding's Makefile only runs tests when wal_level=logical is expected (that's how the relevant Makefile targets are scoped), but it's a complicated special case. Review -- As already noted after the first bullet point, there are some tricky considerations on fixing the race when the user calls bt_index_check(). Reviewers should pay close attention to this; could I be wrong about that being race-safe even now? This seems to me to be by far the best place for reviewers to look for problems in this patch, as it's the only bt_index_check()/AccessShareLock case that compares anything *across* pages. The measures taken to prevent false positives described within bt_target_page_check() require expert review, especially because this is the one case where a faulty rationale may result in under-protection from false positives, and not just harmless over-protection. The rationale for why we can reliably recover from a race described within the C function bt_right_page_check_scankey() is *complicated*. I may have failed to make it as simple as possible despite significant effort. It's hard to describe these things
Re: [HACKERS] raw output from copy
On 03/28/2016 06:26 PM, Tom Lane wrote: Pavel Stehule writes: [ copy-raw-format-20160227-03.patch ] I looked at this patch. I'm having a hard time accepting that it has a use-case large enough to justify it, and here's the reason: it's a protocol break. Conveniently omitting to update protocol.sgml doesn't make it not a protocol break. (libpq.sgml also contains assorted statements that are falsified by this patch.) You could argue that it's the user's own fault if he tries to use COPY RAW with client-side code that hasn't been updated to support it. Maybe that's okay, but I wonder if we're opening ourselves up to problems. Maybe even security-grade problems. In terms of specific code that hasn't been updated, ecpg is broken by this patch, and I'm not very sure what libpq's PQbinaryTuples() ought to do but probably something other than what it does today. There's also a definitional question of what we think PQfformat() ought to do; should it return "2" for the per-field format? Or maybe the per-field format is still "1", since it's after all the same binary data format as for COPY BINARY, and only the overall copy format reported by PQbinaryTuples() should change to "2". BTW, I'm not really sure why the patch is trying to enforce single row and column for the COPY OUT case. I thought the idea for that was that we'd just shove out the data without any delimiters, and if it's more than one datum it's the user's problem whether he can identify the boundaries. On the input side we would have to insist on one column since we're not going to attempt to identify boundaries (and one row would fall out of the fact that we slurp the entire input and treat it as one datum). Anyway this is certainly not committable as-is, so I'm setting it back to Waiting on Author. But the fact that both libpq and ecpg would need updates makes me question whether we can safely pretend that this isn't a protocol break. 
In that case I humbly submit that there is a case for reviving the psql patch I posted that kicked off this whole thing and lets you export a piece of binary data from psql quite easily. It should certainly not involve any protocol break. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reworks of CustomScan serialization/deserialization
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas > Sent: Tuesday, March 29, 2016 10:54 AM > To: Kaigai Kouhei(海外 浩平) > Cc: Petr Jelinek; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Reworks of CustomScan serialization/deserialization > > On Mon, Mar 28, 2016 at 9:36 PM, Kouhei Kaigai wrote: > > I don't have a strong reason to keep these stuff in separate files. > > Both stuffs covers similar features and amount of code are enough small. > > So, the attached v4 just merged custom-node.[ch] stuff into extensible. > > > > Once we put similar routines closely, it may be better to consolidate > > these routines. > > As long as EXTNODENAME_MAX_LEN == CUSTOM_NAME_MAX_LEN, both features > > have identical structure layout, so it is easy to call an internal > > common function to register or find out a table of callbacks according > > to the function actually called by other modules. > > > > I'm inclined to think to replace EXTNODENAME_MAX_LEN and > > CUSTOM_NAME_MAX_LEN by NAMEDATALEN again, to fit structure layout. > > I don't think that we need both EXTNODENAME_MAX_LEN and > CUSTOM_NAME_MAX_LEN; we can use EXTNODENAME_MAX_LEN for both. I'm > opposed to using NAMEDATALEN for anything unrelated to the size of a > Name. If it's not being stored in a catalog, it doesn't need to care. > OK, I adjusted the v4 patch to use EXTNODENAME_MAX_LEN for both. The hash entry structure was revised as follows; entries are registered via an internal common function, RegisterExtensibleNodeEntry, and looked up via another internal common function, GetExtensibleNodeEntry.

typedef struct
{
    char        extnodename[EXTNODENAME_MAX_LEN];
    const void *extnodemethods;
} ExtensibleNodeEntry;

ExtensibleNodeMethods and CustomScanMethods are stored separately in 'extensible_node_methods' and 'custom_scan_methods'. The entry-point functions call the internal common functions above with the appropriate HTAB variable. 
It will be reusable if we add further extensible node types in future versions. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei pgsql-v9.6-custom-scan-serialization-reworks.5.patch Description: pgsql-v9.6-custom-scan-serialization-reworks.5.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Relation extension scalability
On Mon, Mar 28, 2016 at 8:46 AM, Dilip Kumar wrote: > Found one problem with V15, so sending the new version. > In V15 I am taking prev_blkno as targetBlock instead it should be the last > block of the relation at that time. Attaching new patch.

     BlockNumber targetBlock,
-                otherBlock;
+                otherBlock,
+                prev_blkno = RelationGetNumberOfBlocks(relation);

Absolutely not. There is no way it's acceptable to introduce an unconditional call to RelationGetNumberOfBlocks() into every call to RelationGetBufferForTuple(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Relation extension scalability
On Sun, Mar 27, 2016 at 9:51 PM, Dilip Kumar wrote: > I think this is better option, Since we will search last two pages of FSM > tree, then no need to update the upper level of the FSM tree. Right ? Well, it's less important in that case, but I think it's still worth doing. Some people are going to do just plain GetPageWithFreeSpace(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reworks of CustomScan serialization/deserialization
On Mon, Mar 28, 2016 at 9:36 PM, Kouhei Kaigai wrote: > I don't have a strong reason to keep these stuff in separate files. > Both stuffs covers similar features and amount of code are enough small. > So, the attached v4 just merged custom-node.[ch] stuff into extensible. > > Once we put similar routines closely, it may be better to consolidate > these routines. > As long as EXTNODENAME_MAX_LEN == CUSTOM_NAME_MAX_LEN, both features > have identical structure layout, so it is easy to call an internal > common function to register or find out a table of callbacks according > to the function actually called by other modules. > > I'm inclined to think to replace EXTNODENAME_MAX_LEN and > CUSTOM_NAME_MAX_LEN by NAMEDATALEN again, to fit structure layout. I don't think that we need both EXTNODENAME_MAX_LEN and CUSTOM_NAME_MAX_LEN; we can use EXTNODENAME_MAX_LEN for both. I'm opposed to using NAMEDATALEN for anything unrelated to the size of a Name. If it's not being stored in a catalog, it doesn't need to care. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Only try to push down foreign joins if the user mapping OIDs mat
On Thu, Mar 17, 2016 at 2:26 AM, Ashutosh Bapat wrote: > On Wed, Mar 16, 2016 at 10:22 PM, Tom Lane wrote: >> Robert Haas writes: >> > On Wed, Mar 16, 2016 at 4:10 AM, Ashutosh Bapat < >> > ashutosh.ba...@enterprisedb.com> wrote: >> >> In 9.5, postgres_fdw allowed to prepare statements involving foreign >> >> tables without an associated user mapping as long as planning did not >> >> require the user mapping. Remember, planner would require user mapping >> >> in >> >> case use_remote_estimate is ON for that foreign table. The user mapping >> >> would be certainly required at the time of execution. So executing such >> >> a >> >> prepared statement would throw error. If somebody created a user >> >> mapping >> >> between prepare and execute, execute would not throw an error. A join >> >> can >> >> be pushed down only when user mappings associated with the joining >> >> relations are known and they match. But given the behavior of 9.5 we >> >> should >> >> let the prepare succeed, even if the join can not be pushed down >> >> because of >> >> unknown user mapping. Before this fix, the test was not letting the >> >> prepare >> >> succeed, which is not compliant with 9.5. >> >> > If a query against a single table with no user mapping is legal, I don't >> > see why pushing down a join between two tables neither of which has a >> > user >> > mapping shouldn't also be legal. >> >> The key point here is that Ashutosh is arguing on the basis of the >> behavior of postgres_fdw, which is not representative of all FDWs. >> The core code has no business assuming that all FDWs require user >> mappings; file_fdw is a counterexample. >> > > Here's what the patch is doing. > > The "core" code FDW is given chance to add paths for a join between foreign > relations if they are from the same (valid) server and have same user > mappings, even if the user mapping is invalid. This is code in > build_join_rel(). 
So, from core code's perspective it's perfectly fine to > push a join down when both sides do not have valid user mapping associated > with them. So, file_fdw is capable of implementing join pushdown even > without having user mapping. > > postgres_fdw code however is different. Rest of the discussion only pertains > to postgres_fdw. The comment in postgres_fdw.sql testcase, to which Robert > objected, is relevant only for postgres_fdw. > > postgres_fdw requires user mapping to execute the query. For base foreign > tables, it's fine not to have a user mapping at the time of planning as long > as planner doesn't need to query the foreign server. But it certainly > requires a user mapping at the time of execution, or else it throws error at > the time of execution. So, Robert's statement "If a query against a single > table with no user mapping is legal" is not entirely correct in the context > of postgres_fdw; postgresGetForeignRelSize(), which is called during > planning, can throw error if it doesn't find a valid user mapping. If a join > between two postgres_fdw foreign relations without valid user mappings is > decided to be pushed down at the time of planning, it will require a user > mapping at the time of execution. This user mapping is derived from the > joining relations recursively. If all of them have same user mapping (even > if invalid) it's fine. If it's invalid, we will throw error at the time of > execution. But if they acquire different user mappings, postgres_fdw can not > fire the join query. It can not use any of the user mappings since that will > compromise security. So, we have landed with a plan which can not be > executed. To be on the safer side, postgres_fdw does not push a join down if > joining sides do not have a valid user mapping. A base foreign relation with > postgres_fdw will require a user mapping, for all serious uses. 
So, it's not > likely that we will end up with joins between postgres_fdw foreign relations > with invalid user mapping. Makes sense to me. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reworks of CustomScan serialization/deserialization
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas > Sent: Friday, March 25, 2016 12:27 AM > To: Petr Jelinek > Cc: Kaigai Kouhei(海外 浩平); pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Reworks of CustomScan serialization/deserialization > > On Wed, Mar 23, 2016 at 1:36 PM, Petr Jelinek wrote: > > Ok, I am happy with it, marked it as ready for committer (it was marked as > > committed although it wasn't committed). > > Thanks for fixing the status. I had forgotten about this thread. > > I can't really endorse the naming conventions here. I mean, we've got > the main extensible nodes stuff in extensible.h, and then we've got > this stuff in custom_node.h (BTW, there is a leftover reference to > custom-node.h). There's no hint in the naming that this relates to > scans, and why is it extensible in one place and custom in another? > > I'm not quite sure how to clean this up. At a minimum, I think we > should standardize on "custom_scan.h" instead of "custom_node.h". I > think that would be clearer. But I'm wondering if we should bite the > bullet and rename everything from "custom" to "extensible" and declare > it all in "extensible.h". > I don't have a strong reason to keep this stuff in separate files. Both cover similar features, and the amount of code is small enough, so the attached v4 simply merges the custom-node.[ch] stuff into extensible.[ch]. Once similar routines are placed close together, it may be better to consolidate them. As long as EXTNODENAME_MAX_LEN == CUSTOM_NAME_MAX_LEN, both features have identical structure layouts, so it is easy to call an internal common function to register or look up a table of callbacks according to the function actually called by other modules. I'm inclined to replace EXTNODENAME_MAX_LEN and CUSTOM_NAME_MAX_LEN with NAMEDATALEN again, to align the structure layouts. > src/backend/nodes/custom_node.c:45: indent with spaces. 
> +} > Oops, thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei pgsql-v9.6-custom-scan-serialization-reworks.4.patch Description: pgsql-v9.6-custom-scan-serialization-reworks.4.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extend pgbench expressions with functions
* Michael Paquier (michael.paqu...@gmail.com) wrote: > On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas wrote: > > On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote: > >> v40 is yet another rebase. > > > > Thanks. Committed after removing an unnecessary parameter from the > > coerceToXXX() functions. > > > > I guess the question here is whether we want the part-c patch, which > > removes the historical \setrandom syntax. That's technically no > > longer needed since we now can use random() as a function directly > > inside \set commands, but we might want it anyway for backward > > compatibility. > > > > Anybody have an opinion on that? > > +1 for nuking it. That's not worth the trouble maintaining it. If we don't nuke it, it'll never die. See also: pg_shadow Thanks! Stephen
Re: [HACKERS] extend pgbench expressions with functions
On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas wrote: > On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote: >> v40 is yet another rebase. > > Thanks. Committed after removing an unnecessary parameter from the > coerceToXXX() functions. > > I guess the question here is whether we want the part-c patch, which > removes the historical \setrandom syntax. That's technically no > longer needed since we now can use random() as a function directly > inside \set commands, but we might want it anyway for backward > compatibility. > > Anybody have an opinion on that? +1 for nuking it. That's not worth the trouble maintaining it. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VS 2015 support in src/tools/msvc
On Tue, Mar 29, 2016 at 10:06 AM, Robert Haas wrote: > On Sun, Mar 27, 2016 at 10:35 PM, Michael Paquier > wrote: >> On Fri, Mar 25, 2016 at 9:48 PM, Andrew Dunstan wrote: >>> OK, sounds good. >> >> Just a side-note. Andres has pushed the fix for the GinIs* macros as >> af4472bc, making patch 0003 from the last series useless now. > > I'm not prepared to get very involved in this patch series since I am > not a Windows guy, but I went ahead and pushed 0001 anyway. I'm not > sure that we want to commit 0002 without a BF machine. Can anybody > help with that? I'll try to get a machine up and running; I guess Win7 with VS2015 at this stage. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Sync tzload() and tzparse() APIs with IANA release tzcode2016c.
On Tue, Mar 29, 2016 at 6:19 AM, Tom Lane wrote: > Sync tzload() and tzparse() APIs with IANA release tzcode2016c. > > This brings us a bit closer to matching upstream, but since it affects > files outside src/timezone/, we might choose not to back-patch it. > Hence keep it separate from the main update patch. Buildfarm-not-being-happy-status: woodlouse, mastodon, thrips, jacana. http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=woodlouse&dt=2016-03-29%2000%3A42%3A08 The origin of the problem is this failure, which causes all the subsequent queries to fail: SET TimeZone to 'UTC'; + ERROR: invalid value for parameter "TimeZone": "UTC" -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VS 2015 support in src/tools/msvc
On Sun, Mar 27, 2016 at 10:35 PM, Michael Paquier wrote: > On Fri, Mar 25, 2016 at 9:48 PM, Andrew Dunstan wrote: >> OK, sounds good. > > Just a side-note. Andres has pushed the fix for the GinIs* macros as > af4472bc, making patch 0003 from the last series useless now. I'm not prepared to get very involved in this patch series since I am not a Windows guy, but I went ahead and pushed 0001 anyway. I'm not sure that we want to commit 0002 without a BF machine. Can anybody help with that? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Typo in comment
On Sat, Mar 26, 2016 at 4:21 PM, Thomas Munro wrote: > Here are a couple of patches to fix a typo in a comment in latch.c: > > - * The memory barrier has be to be placed here to ensure that any flag > + * The memory barrier has to be placed here to ensure that any flag Committed, but it hardly seems worth back-patching. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extend pgbench expressions with functions
On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote: > v40 is yet another rebase. Thanks. Committed after removing an unnecessary parameter from the coerceToXXX() functions. I guess the question here is whether we want the part-c patch, which removes the historical \setrandom syntax. That's technically no longer needed since we now can use random() as a function directly inside \set commands, but we might want it anyway for backward compatibility. Anybody have an opinion on that? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
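For readers following along, the two forms look like this in a pgbench script (a sketch; `aid` and `:naccounts` are the usual pgbench account variables). The first line is the historical \setrandom syntax under discussion; the second is the equivalent \set expression using the just-committed random() function:

```
\setrandom aid 1 :naccounts
\set aid random(1, :naccounts)
```

The expression form is strictly more general, since random() can be composed with arithmetic and other functions inside \set, which is why \setrandom is now redundant.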
Re: [HACKERS] WIP: Upper planner pathification
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai > Sent: Saturday, March 19, 2016 8:57 AM > To: Tom Lane > Cc: Robert Haas; Petr Jelinek; David Rowley; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] WIP: Upper planner pathification > > > -Original Message- > > From: pgsql-hackers-ow...@postgresql.org > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane > > Sent: Friday, March 18, 2016 11:44 PM > > To: Kaigai Kouhei(海外 浩平) > > Cc: Robert Haas; Petr Jelinek; David Rowley; pgsql-hackers@postgresql.org > > Subject: Re: [HACKERS] WIP: Upper planner pathification > > > > Kouhei Kaigai writes: > > > On Wed, Mar 16, 2016 at 2:47 PM, Tom Lane wrote: > > >> I do not, however, like the proposal to expose wflists and so forth. > > >> Those are internal data structures in grouping_planner and I absolutely > > >> refuse to promise that they're going to stay stable. > > > > > Hmm... It's not easy to imagine a case that extension wants own idea > > > to extract window functions from the target list and select active > > > windows, even if extension wants to have own executor and own cost > > > estimation logic. > > > In case when extension tries to add WindowPath + CustomPath(Sort), > > > extension is interested in alternative sort task, but not window > > > function itself. It is natural to follow the built-in implementation, > > > thus, it motivates extension author to take copy & paste the code. > > > select_active_windows() is static, so extension needs to have same > > > routine on their side. > > > > Well, to be perfectly blunt about it, I have said from day one that this > > notion that a CustomScan extension will be able to cause arbitrary planner > > behavior changes is loony. 
We are simply not going to drop a hook into > > every tenth line of the planner for you, nor de-static-ify every internal > > function, nor (almost equivalently) expose the data structures those > > functions produce, because it would cripple core planner development to > > try to keep the implied APIs stable. And I continue to maintain that any > > actually-generally-useful ideas would be better handled by submitting them > > as patches to the core planner, rather than trying to implement them in an > > arms-length extension. > > > > In the case at hand, I notice that the WindowFuncLists struct is > > actually from find_window_functions in clauses.c, so an extension > > that needed to get hold of that would be unlikely to do any copying > > and pasting anyhow -- it'd just call find_window_functions again. > > (That only needs to search the targetlist, so it's not that expensive.) > > The other lists you mention are all tightly tied to specific, and not > > terribly well-designed, implementation strategies for grouping sets and > > window functions. Those are *very* likely to change in the near future; > > and even if they don't, I'm skeptical that an external implementor of > > grouping sets or window functions would want to use exactly those same > > implementation strategies. Maybe the only reason you're there at all > > is you want to be smarter about the order of doing window functions, > > for example. > > > > So I really don't want to export any of that stuff. > > > Hmm. I could understand we have active development around this area > thus miscellaneous internal data structure may not be enough stable > to expose the extension. > Don't you deny recall the discussion once implementation gets calmed > down on the future development cycle? > > > As for other details of the proposed patch, I was intending to put > > all the hook calls into grouping_planner for consistency, rather than > > scattering them between grouping_planner and its subroutines. 
So that > > would probably mean calling the hook for a given step *before* we > > generate the core paths for that step, not after. Did you have a > > reason to want the other order? (If you say "so the hook can look > > at the core-made paths", I'm going to say "that's a bad idea". It'd > > further increase the coupling between a CustomScan extension and core.) > > > No deep reason. I just followed the manner in scan/join path hook; that > calls extension once the core feature adds built-in path nodes. > Ah, I overlooked a deep reason. ForeignScan/CustomScan may need an alternative execution path if the extension supports corner-case handling for cases where these nodes cannot execute, or can execute only at an unreasonable cost, at execution time, even though the planner considered the path the most reasonable at optimization time. One example at the scan/join level was EPQ rechecks: invoking the remote join for each EPQ recheck was not reasonable, so ForeignPath got the fdw_outerpath field to pick up a core-made path. I can expect other similar cases. For example, a custom scan that supports multi-node aggregation may want to switch to built-in aggregation if the network would be un
Re: [HACKERS] Some messages of pg_rewind --debug not getting translated
On Tue, Mar 29, 2016 at 2:45 AM, Alvaro Herrera wrote:
> Michael Paquier wrote:
>> On Wed, Mar 23, 2016 at 12:24 AM, Alvaro Herrera wrote:
>> > Seems reasonable. For the last hunk in your patch, though, I would add
>> > a /* translator: */ comment explaining what each of the values is;
>> > otherwise it's just incomprehensible percent-sign-salad for the poor
>> > translator. Note that if the line is too long you have to use dirty
>> > tricks to avoid pgindent from messing it up (see 673b52753489 for
>> > example)
>>
>> OK, done this way.
>>
>> > I'm not sure if we should keep "blk" in the original or make it a
>> > complete word.
>>
>> OK, a complete word makes more sense.
>
> Thanks, pushed.

Thanks.

> If you're interested in improving translatability of this program
> further, I suggest that messages of this sort
> msgid "BKPBLOCK_HAS_DATA set, but no data included at %X/%X"
> should have a %s instead of the flag name.

I'll think about it, though I am afraid this would reduce the code readability in xlogreader.c.
-- 
Michael
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc
Craig Ringer wrote: > I found a minor issue with the new psql method while writing tests for > failover slots. Patch attached. Thanks, pushed. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pthread portability
Michael McConville wrote: > The below diff fixes one problem: you can't compare pthread_t values > directly. Only the function pthread_equal(3) is defined. Direct > comparison usually works because most implementations define pthread_t > as an integer type. So is there a platform where this assumption doesn't hold? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pthread portability
The below diff fixes one problem: you can't compare pthread_t values directly. Only the function pthread_equal(3) is defined. Direct comparison usually works because most implementations define pthread_t as an integer type.

Relatedly, INVALID_THREAD is defined as (pthread_t)0. I don't think this is a portable way of checking whether a thread is valid, and I don't know if it's actually possible to get an "invalid" thread out of pthread_create(3) if it succeeds (returns 0). In the cases where you're setting a pthread_t to INVALID_THREAD, maybe using a struct that includes a pthread_t and a 'valid' bool would be preferable.

Thanks for your time,
Michael

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4196b0e..f2e5aed 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3791,7 +3791,7 @@ main(int argc, char **argv)
 	{
 		int		err = pthread_create(&thread->thread, NULL, threadRun, thread);
 
-		if (err != 0 || thread->thread == INVALID_THREAD)
+		if (err != 0 || pthread_equal(thread->thread, INVALID_THREAD))
 		{
 			fprintf(stderr, "could not create thread: %s\n", strerror(err));
 			exit(1);
@@ -3819,7 +3819,7 @@ main(int argc, char **argv)
 		TState	   *thread = &threads[i];
 
 #ifdef ENABLE_THREAD_SAFETY
-		if (threads[i].thread == INVALID_THREAD)
+		if (pthread_equal(threads[i].thread, INVALID_THREAD))
 			/* actually run this thread directly in the main thread */
 			(void) threadRun(thread);
 		else
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] help for old extension, first_last_agg bug in install with pg9.5
The C implementation is simple; the problem is with the makefile:
https://github.com/wulczer/first_last_agg/issues/2

Any clues?

$ make
Makefile:25: /usr/lib/postgresql/9.5/lib/pgxs/src/makefiles/pgxs.mk: No such file or directory
make: *** No rule to make target `/usr/lib/postgresql/9.5/lib/pgxs/src/makefiles/pgxs.mk'.  Stop.

Using Ubuntu 14 LTS.

See also https://github.com/wulczer/first_last_agg/blob/master/Makefile
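For reference, PGXS-based makefiles conventionally locate pgxs.mk through pg_config rather than a hard-coded install path (on Debian/Ubuntu that file is typically shipped by the server development package, e.g. postgresql-server-dev-9.5). A minimal sketch of the conventional preamble, assuming first_last_agg as the extension name:

```makefile
# Ask pg_config where pgxs.mk lives instead of hard-coding a path.
EXTENSION = first_last_agg
PG_CONFIG ?= pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```

If `$(PG_CONFIG) --pgxs` prints nothing, the development package for that PostgreSQL version is most likely not installed.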
Re: [HACKERS] raw output from copy
Pavel Stehule writes: > [ copy-raw-format-20160227-03.patch ] I looked at this patch. I'm having a hard time accepting that it has a use-case large enough to justify it, and here's the reason: it's a protocol break. Conveniently omitting to update protocol.sgml doesn't make it not a protocol break. (libpq.sgml also contains assorted statements that are falsified by this patch.) You could argue that it's the user's own fault if he tries to use COPY RAW with client-side code that hasn't been updated to support it. Maybe that's okay, but I wonder if we're opening ourselves up to problems. Maybe even security-grade problems. In terms of specific code that hasn't been updated, ecpg is broken by this patch, and I'm not very sure what libpq's PQbinaryTuples() ought to do but probably something other than what it does today. There's also a definitional question of what we think PQfformat() ought to do; should it return "2" for the per-field format? Or maybe the per-field format is still "1", since it's after all the same binary data format as for COPY BINARY, and only the overall copy format reported by PQbinaryTuples() should change to "2". BTW, I'm not really sure why the patch is trying to enforce single row and column for the COPY OUT case. I thought the idea for that was that we'd just shove out the data without any delimiters, and if it's more than one datum it's the user's problem whether he can identify the boundaries. On the input side we would have to insist on one column since we're not going to attempt to identify boundaries (and one row would fall out of the fact that we slurp the entire input and treat it as one datum). Anyway this is certainly not committable as-is, so I'm setting it back to Waiting on Author. But the fact that both libpq and ecpg would need updates makes me question whether we can safely pretend that this isn't a protocol break. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN is missing in multicolumn indexes documentation
Petr Jediný wrote: > Comments? This is my first patch to postgresql project, so comments > are very much welcomed. I think it's fine, so I pushed it. Thanks. I hope you already read this https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Relation extension scalability
On 28/03/16 14:46, Dilip Kumar wrote:
> Conclusion:
> 1. I think v15 solves the problem that existed with v13, performance is
> significantly higher compared to base, and the relation size is also
> stable. So IMHO v15 is the winner over the other solutions; what do
> others think?

It seems so. Do you have the ability to reasonably test with 64 clients? I am mostly wondering whether we see performance going further down or just a plateau. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dealing with collation and strcoll/strxfrm/etc
On Mon, Mar 28, 2016 at 12:36 PM, Stephen Frost wrote: > Having to figure out how each and every stdlib does versioning doesn't > sound fun, I certainly agree with you there, but it hardly seems > impossible. What we need, even if we look to move to ICU, is a place to > remember that version information and a way to do something when we > discover that we're now using a different version. I think that the versioning situation is all over the place. It isn't in the C standard. And there are many different versions of many different stdlibs to support. Most importantly, where support nominally exists, a strong incentive to get it exactly right may not. We've seen that already. > I'm not quite sure what the best way to do that is, but I imagine it > involves changes to existing catalogs or perhaps even a new one. I > don't have any particularly great ideas for existing releases (maybe > stash information in the index somewhere when it's rebuilt and then > check it and throw an ERROR if they don't match?) I think we'd need to introduce an abstraction like a "collation provider", of which ICU would theoretically be just one. The OS would be a baked-in collation provider. Everything that works today would continue to work. We'd then largely just be grandfathering out systems that rely on OS locales across major version upgrades, since the vast majority of users are happy with Unicode, and have no cultural or technical reason to prefer the OS locales that I can think of. I am unconvinced with the idea that it especially matters that sort(1) might not be in agreement with Postgres. Neither is any Java app, or any .Net app, or the user's web browser in the case of Safari or Google Chrome (maybe others). I want Postgres to be consistent with Postgres, across different nodes on the network, in environments where I may have little knowledge of the underlying OS. Think "sort pushdown in postgres_fdw". 
Users from certain East Asian user communities might prefer to stick with regional encodings, perhaps due to specific concerns about the Han Unification controversy. But I'm pretty sure that these users have very low expectations about collations in Postgres today. I was recently told that collating Japanese is starting to get a bit better, due to various new initiatives, but that most experienced Japanese Postgres DBAs tend to use the "C" collation. I don't want to impose a Unicode monoculture on anyone. But I do think there are clear benefits for the large majority of users that always use Unicode. Nothing needs to break that works today to make this happen. Abbreviated keys provide an immediate incentive for users to adopt ICU; users that might otherwise be on the fence about it. >> The question is only how we deal with this when it happens. One thing >> that's attractive about ICU is that it makes this explicit, both for >> the logical behavior of a collation, as well as the stability of >> binary sort keys (Glibc's versioning seemingly just does the former). >> So the equivalent of strxfrm() output has license to change for >> technical reasons that are orthogonal to the practical concerns of >> end-users about how text sorts in their locale. ICU is clear on what >> it takes to make binary sort keys in indexes work. And various major >> database systems rely on this being right. > > There seems to be some disagreement about if ICU provides the > information we'd need to make a decision or not. It seems like it > would, given its usage in other database systems, but if so, we need to > very clearly understand exactly how it works and how we can depend on > it. It seems likely that it exposes the information required to make what we need to do practical. 
Certainly, adopting ICU is a big project that we should proceed cautiously with, but there is a reason why every other major database system uses either ICU, or a library based on UCA [1] that allows the system to centrally control versioned collations (SQLite just makes this optional). I think that ICU *could* still tie us to the available collations on an OS (those collations that are available with their ICU packages). What I haven't figured out yet is if it's practical to install versions that are available from some central location, like the CLDR [2]. I don't think we'd want to have Postgres ship "supported collations" in each major version, in roughly the style of the IANA timezone stuff, but it's far too early to rule that out. It would have upsides. [1] https://en.wikipedia.org/wiki/Unicode_collation_algorithm [2] http://cldr.unicode.org/ -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] SQL access to access method details
While working on a tool to capture catalog/stats info and looking at cross version compatibility, I noticed that the pg_am changes removed SQL access to a bunch of AM info. [1] indicates that's part of the purpose of the patch; are we sure no tools are using this info? Unlike previous catalog compatibility breaks, this one completely removes that information, so if someone was using it they're now completely hosed. [1] http://www.postgresql.org/message-id/55fec1ab.8010...@2ndquadrant.com -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dealing with collation and strcoll/strxfrm/etc
* Peter Geoghegan (p...@heroku.com) wrote: > On Mon, Mar 28, 2016 at 7:57 AM, Stephen Frost wrote: > > If we're going to talk about minimum requirements, I'd like to argue > > that we require whatever system we're using to have versioning (which > > glibc currently lacks, as I understand it...) to avoid the risk that > > indexes will become corrupt when whatever we're using for collation > > changes. I'm pretty sure that's already bitten us on at least some > > RHEL6 -> RHEL7 migrations in some locales, even forgetting the issues > > with strcoll vs. strxfrm. > > I totally agree that anything we should adopt should support > versioning. Glibc does have a non-standard versioning scheme, but we > don't use it. Other stdlibs may do versioning another way, or not at > all. A world in which ICU is the defacto standard for Postgres (i.e. > the actual standard on all major platforms), we mostly just have one > thing to target, which seems like something to aim for. Having to figure out how each and every stdlib does versioning doesn't sound fun, I certainly agree with you there, but it hardly seems impossible. What we need, even if we look to move to ICU, is a place to remember that version information and a way to do something when we discover that we're now using a different version. I'm not quite sure what the best way to do that is, but I imagine it involves changes to existing catalogs or perhaps even a new one. I don't have any particularly great ideas for existing releases (maybe stash information in the index somewhere when it's rebuilt and then check it and throw an ERROR if they don't match?) > The question is only how we deal with this when it happens. One thing > that's attractive about ICU is that it makes this explicit, both for > the logical behavior of a collation, as well as the stability of > binary sort keys (Glibc's versioning seemingly just does the former). 
> So the equivalent of strxfrm() output has license to change for > technical reasons that are orthogonal to the practical concerns of > end-users about how text sorts in their locale. ICU is clear on what > it takes to make binary sort keys in indexes work. And various major > database systems rely on this being right. There seems to be some disagreement about if ICU provides the information we'd need to make a decision or not. It seems like it would, given its usage in other database systems, but if so, we need to very clearly understand exactly how it works and how we can depend on it. > > Regarding key abbreviation and performance, if we are confident that > > strcoll and strxfrm are at least independently internally consistent > > then we could consider offering an option to choose between them. > > I think they just need to match, per the standard. After all, > abbreviation will sometimes require strcoll() tie-breakers. Ok, I didn't see that in the man-pages. If that's the case then it seems like there isn't much hope of just using strxfrm(). Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Sequence Access Method WIP
On Thu, Mar 24, 2016 at 6:12 PM, Petr Jelinek wrote:
>
> Hi,
>
> I rebased this on top of the recently committed CREATE ACCESS METHOD.
>

Hi,

I got the following error trying to apply it to the current master:

$ git apply /home/fabrizio/Downloads/0001-seqam-2016-03-24.patch
error: patch failed: src/backend/commands/amcmds.c:29
error: src/backend/commands/amcmds.c: patch does not apply

There is a wrong definition at the beginning of amcmds.c:

34 <<< ours
35 static Oid lookup_index_am_handler_func(List *handler_name, char amtype);
36 static const char *get_am_type_string(char amtype);
37 ===
38 static Oid lookup_am_handler_func(List *handler_name, char amtype);
39 static char *get_am_type_string(char amtype);
40 >>> theirs

After this small fix I can build and run the regression tests without errors. But running "check-world" I got this error:

make[1]: Leaving directory `/data/postgresql/src/test/regress'
make: Leaving directory `/data/postgresql'
+ pg_dumpall -f /data/postgresql/src/bin/pg_upgrade/tmp_check/dump1.sql
ok 9 - dropuser foobar1 exit code 0
ok 10 - SQL DROP ROLE run: SQL found in server log
ok 11 - fails with nonexistent user
ok
t/080_pg_isready.pl ...
1..10
ok 1 - pg_isready --help exit code 0
ok 2 - pg_isready --help goes to stdout
ok 3 - pg_isready --help nothing to stderr
ok 4 - pg_isready --version exit code 0
ok 5 - pg_isready --version goes to stdout
ok 6 - pg_isready --version nothing to stderr
ok 7 - pg_isready with invalid option nonzero exit code
ok 8 - pg_isready with invalid option prints error message
ok 9 - fails with no server running
pg_dump: [archiver (db)] query failed: ERROR:  column "sequence_name" does not exist
LINE 1: SELECT sequence_name, start_value, increment_by, CASE WHEN i...
^
pg_dump: [archiver (db)] query was: SELECT sequence_name, start_value, increment_by,
    CASE WHEN increment_by > 0 AND max_value = 9223372036854775807 THEN NULL
         WHEN increment_by < 0 AND max_value = -1 THEN NULL
         ELSE max_value END AS max_value,
    CASE WHEN increment_by > 0 AND min_value = 1 THEN NULL
         WHEN increment_by < 0 AND min_value = -9223372036854775807 THEN NULL
         ELSE min_value END AS min_value,
    cache_value, is_cycled FROM check_seq
pg_dumpall: pg_dump failed on database "regression", exiting
+ pg_dumpall1_status=1
+ [ /data/postgresql != /data/postgresql ]
+ /data/postgresql/src/bin/pg_upgrade/tmp_check/install//home/fabrizio/pgsql/bin/pg_ctl -m fast stop
waiting for server to shut down done
server stopped
+ [ -n ]
+ [ -n ]
+ [ -n 1 ]
+ echo pg_dumpall of pre-upgrade database cluster failed
pg_dumpall of pre-upgrade database cluster failed
+ exit 1
+ rm -rf /tmp/pg_upgrade_check-3NUa0X
make[2]: *** [check] Error 1
make[2]: Leaving directory `/data/postgresql/src/bin/pg_upgrade'
make[1]: *** [check-pg_upgrade-recurse] Error 2
make[1]: *** Waiting for unfinished jobs

And testing pg_dump itself I got the same error trying to dump a database that contains a sequence:

fabrizio=# create sequence x;
CREATE SEQUENCE
fabrizio=# \ds
        List of relations
 Schema | Name |   Type   |  Owner
--------+------+----------+----------
 public | x    | sequence | fabrizio
(1 row)

fabrizio=# \d x
              Sequence "public.x"
    Column    |       Type        |        Value
--------------+-------------------+---------------------
 start_value  | bigint            | 1
 increment_by | bigint            | 1
 max_value    | bigint            | 9223372036854775807
 min_value    | bigint            | 1
 cache_value  | bigint            | 1
 is_cycled    | boolean           | f
 amstate      | seqam_local_state | (1,f,0)
Access Method: local

fabrizio=# \q
fabrizio@bagual:~/pgsql $ bin/pg_dump > /tmp/fabrizio.sql
pg_dump: [archiver (db)] query failed: ERROR:  column "sequence_name" does not exist
LINE 1: SELECT sequence_name, start_value, increment_by, CASE WHEN i...
^
pg_dump: [archiver (db)] query was: SELECT sequence_name, start_value, increment_by,
    CASE WHEN increment_by > 0 AND max_value = 9223372036854775807 THEN NULL
         WHEN increment_by < 0 AND max_value = -1 THEN NULL
         ELSE max_value END AS max_value,
    CASE WHEN increment_by > 0 AND min_value = 1 THEN NULL
         WHEN increment_by < 0 AND min_value = -9223372036854775807 THEN NULL
         ELSE min_value END AS min_value,
    cache_value, is_cycled FROM x

Tonight I'll review some parts of the code.

Regards,

Att,
-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
Re: [HACKERS] Dealing with collation and strcoll/strxfrm/etc
On Mon, Mar 28, 2016 at 7:57 AM, Stephen Frost wrote: > If we're going to talk about minimum requirements, I'd like to argue > that we require whatever system we're using to have versioning (which > glibc currently lacks, as I understand it...) to avoid the risk that > indexes will become corrupt when whatever we're using for collation > changes. I'm pretty sure that's already bitten us on at least some > RHEL6 -> RHEL7 migrations in some locales, even forgetting the issues > with strcoll vs. strxfrm. I totally agree that anything we should adopt should support versioning. Glibc does have a non-standard versioning scheme, but we don't use it. Other stdlibs may do versioning another way, or not at all. A world in which ICU is the defacto standard for Postgres (i.e. the actual standard on all major platforms), we mostly just have one thing to target, which seems like something to aim for. Collations change from time to time, legitimately. Read from "Collation order is not fixed", here: http://unicode.org/reports/tr10/#Stability The question is only how we deal with this when it happens. One thing that's attractive about ICU is that it makes this explicit, both for the logical behavior of a collation, as well as the stability of binary sort keys (Glibc's versioning seemingly just does the former). So the equivalent of strxfrm() output has license to change for technical reasons that are orthogonal to the practical concerns of end-users about how text sorts in their locale. ICU is clear on what it takes to make binary sort keys in indexes work. And various major database systems rely on this being right. > Regarding key abbreviation and performance, if we are confident that > strcoll and strxfrm are at least independently internally consistent > then we could consider offering an option to choose between them. I think they just need to match, per the standard. After all, abbreviation will sometimes require strcoll() tie-breakers. 
Clearly it would be very naive to imagine that ICU is bug-free. However, I surmise that there is a large difference how ICU and glibc think about things like strxfrm() or strcoll() stability and consistency. Tom was able to demonstrate that strxfrm() and strcoll() behaved inconsistently without too much effort, contrary to POSIX, and in many common cases. I doubt that the Glibc maintainers are all that concerned about it. Certainly, less concerned than they are about the latest security bug. Whereas if this happened in ICU, it would be a total failure of the project to fulfill its most basic goals. Our disaster would also be a disaster for several other major database systems. ICU carefully and explicitly considers multiple forms of stability, "deterministic" sort ordering, etc. That *is* a big difference, and it makes me optimistic that there'd be far fewer problems. I also think that ICU could be a reasonable basis for case-insensitive collations, which would let us kill citext, a module that I consider to be a total kludge. And, we might also be able to lock down WAL compatibility, which would be generally useful. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] Client Log Output Filtering
David Steele writes: > On 3/10/16 11:00 AM, Petr Jelinek wrote: >> The comment above errhidefromclient says "Only log levels below ERROR >> can be hidden from the client." but use of the errhidefromclient(true) >> actually does hide the error message from client, client just gets >> failed query without any message when used with ERROR level. > Right you are. The v3 patch adds this check. > I also added the documentation to sources.sgml that Tom pointed out was > missing. I set to work on committing this, but was soon rather dissatisfied with it, because as-implemented there is no way to short-circuit elog.c's processing if the message is not to be sent to either the client or the postmaster log. Ideally this would be taken into account when errstart() figures the output_to_client setting to begin with --- but of course we can't do that with this API, because errhidefromclient() hasn't run yet. The patch is buggy even without that consideration, because it would potentially disable client output of messages even after they've been promoted to FATAL by pg_re_throw. We could fix that by clearing the flag in pg_re_throw, but I think that's just another symptom of the shutoff being done in the wrong place. I wonder just what the expected usage scenario is for this function, and whether it would not be better addressed by inventing some other API rather than modeling it on errhidestmt(), which is by no means the same kind of thing. One idea is to invent a new elevel which is never sent to the client --- analogously to COMMERROR, though probably much higher priority than that, maybe the same priority as LOG. If there actually is a use for a range of elevels on errhidefromclient'd messages, that wouldn't work very well of course. Or we could consider having a flag bit that is OR'd into the elevel, along the lines of ereport(LOG | ERR_HIDE_FROM_CLIENT, (errmsg())); so that the information is available at errstart time. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Some messages of pg_rewind --debug not getting translated
Michael Paquier wrote:
> On Wed, Mar 23, 2016 at 12:24 AM, Alvaro Herrera wrote:
> > Seems reasonable. For the last hunk in your patch, though, I would add
> > a /* translator: */ comment explaining what each of the values is;
> > otherwise it's just incomprehensible percent-sign-salad for the poor
> > translator. Note that if the line is too long you have to use dirty
> > tricks to avoid pgindent from messing it up (see 673b52753489 for
> > example)
>
> OK, done this way.
>
> > I'm not sure if we should keep "blk" in the original or make it a
> > complete word.
>
> OK, a complete word makes more sense.

Thanks, pushed.

If you're interested in improving translatability of this program further, I suggest that messages of this sort

msgid "BKPBLOCK_HAS_DATA set, but no data included at %X/%X"

should have a %s instead of the flag name.
-- 
Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
On Thu, Mar 24, 2016 at 7:12 PM, Peter Geoghegan wrote: > On Thu, Mar 24, 2016 at 7:17 AM, Robert Haas wrote: >> I really like this idea, and the performance results seem impressive, >> but I think we should push this out to 9.7. A btree patch that didn't >> have WAL support until two and a half weeks into the final CommitFest >> just doesn't seem to me like a good candidate. First, as a general >> matter, if a patch isn't code-complete at the start of a CommitFest, >> it's reasonable to say that it should be reviewed but not necessarily >> committed in that CommitFest. This patch has had some review, but I'm >> not sure how deep that review is, and I think it's had no code review >> at all of the WAL logging changes, which were submitted only a week >> ago, well after the CF deadline. Second, the btree AM is a >> particularly poor place to introduce possibly destabilizing changes. >> Everybody depends on it, all the time, for everything. And despite >> new tools like amcheck, it's not a particularly easy thing to debug. > > Regrettably, I must agree. I don't see a plausible path to commit for > this patch in the ongoing CF. > > I think that Anastasia did an excellent job here, and I wish I could > have been of greater help sooner. Nevertheless, it would be unwise to > commit this given the maturity of the code. There have been very few > instances of performance improvements to the B-Tree code for as long > as I've been interested, because it's so hard, and the standard is so > high. The only example I can think of from the last few years is > Kevin's commit 2ed5b87f96 and Tom's commit 1a77f8b63d both of which > were far less invasive, and Simon's commit c7111d11b1, which we just > outright reverted from 9.5 due to subtle bugs (and even that was > significantly less invasive than this patch). Improving nbtree is > something that requires several rounds of expert review, and that's > something that's in short supply for the B-Tree code in particular. 
I > think that a new testing strategy is needed to make this easier, and I > hope to get that going with amcheck. I need help with formalizing a > "testing first" approach for improving the B-Tree code, because I > think it's the only way that we can move forward with projects like > this. It's *incredibly* hard to push forward patches like this given > our current, limited testing strategy. I've been toying (having gotten nowhere concrete really) with prefix compression myself, I agree that messing with btree code is quite harder than it ought to be. Perhaps trying experimental format changes in a separate experimental am wouldn't be all that bad (say, nxbtree?). People could opt-in to those, by creating the indexes with nxbtree instead of plain btree (say in development environments) and get some testing going without risking much. Normally the same effect should be achievable with mere flags, but since format changes to btree tend to be rather invasive, ensuring the patch doesn't change behavior with the flag off is hard as well, hence the wholly separate am idea. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
Alexander Korotkov wrote: > On Thu, Mar 24, 2016 at 6:19 PM, Alvaro Herrera > wrote: > > .. Oh crap. I just noticed I forgot to update a comment in pg_dump's > > getAccessMethods. And we're missing psql tab-complete support for the > > new commands. > Attached patches fix both these issues. I see Teodor just pushed these two fixes. Thanks! -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
On Fri, Sep 11, 2015 at 8:01 PM, Amit Kapila wrote:
> On Thu, Sep 3, 2015 at 5:11 PM, Andres Freund wrote:
> >
> > Updated comments and the patch (increate_clog_bufs_v2.patch)
> > containing the same is attached.
>

Andres mentioned to me in off-list discussion that he thinks we should first try to fix the clog buffers problem, as he sees in his tests that clog buffer replacement is one of the bottlenecks. He also suggested a test to see whether the increase in buffers could lead to a regression. The basic idea of the test was to ensure that every CLOG access is a disk access. Based on his suggestion, I have written a SQL statement which forces every CLOG access to be a disk access; the query used is as below:

With ins AS (INSERT INTO test_clog_access values(default) RETURNING c1)
Select * from test_clog_access where c1 = (Select c1 from ins) - 32768 * :client_id;

Test Results
------------
HEAD (commit d12e5bb7) - Clog Buffers - 32
Patch-1 - Clog Buffers - 64
Patch-2 - Clog Buffers - 128

Patch_Ver/Client_Count |     1 |    64
HEAD                   | 12677 | 57470
Patch-1                | 12305 | 58079
Patch-2                | 12761 | 58637

The above data is the median of three 10-minute runs, and it indicates that there is no substantial dip from increasing clog buffers.

The test scripts used are attached to this mail. In perf_clog_access.sh, you need to change the data_directory path for your machine; you might also want to change the binary name if you create postgres binaries with different names.

Andres, is this test in line with what you have in mind?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

access_clog_prep.sql Description: Binary data
access_clog.sql Description: Binary data
perf_clog_access.sh Description: Bourne shell script

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN is missing in multicolumn indexes documentation
Comments? This is my first patch to the postgresql project, so comments are very much welcome.

PJ

On Wed, Mar 23, 2016 at 8:40 PM, Petr Jediný wrote:
>>> Multicolumn BRIN is like GIN. Every column is indexed separately.
>>> The order of the columns doesn't matter.
>>
>> Right. You can use one index to cover all columns; the position of the
>> column in the index won't matter for a query that uses one column. The
>> only reason to have multiple BRIN indexes on a single table is to have
>> a different pages_per_range.
>
> Thanks for the information, I've attached the patch updated to v2.
>
> PJ
Re: [HACKERS] Parallel Queries and PostGIS
On Mon, Mar 28, 2016 at 9:45 AM, Stephen Frost wrote:
> Paul,
>
> * Paul Ramsey (pram...@cleverelephant.ca) wrote:
>> I spent some time over the weekend trying out the different modes of
>> parallel query (seq scan, aggregate, join) in combination with PostGIS
>> and have written up the results here:
>>
>> http://blog.cleverelephant.ca/2016/03/parallel-postgis.html
>
> Neat!
>
> Regarding aggregate parallelism and the cascaded union approach, though
> I imagine in other cases as well, it seems like having a
> "final-per-worker" function for aggregates would be useful.
>
> Without actually looking at the code at all, it seems like that wouldn't
> be terribly difficult to add.
>
> Would you agree that it'd be helpful to have for making the st_union()
> work better in parallel?

For our particular situation with ST_Union, yes, it would be ideal to be able to run a worker-side combine function as well as the master-side one. Although the cascaded union would be less effective spread out over N nodes, doing it only once per worker, rather than every N records, would minimize the loss of effectiveness.

P
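The two-stage shape being discussed can be modeled with a toy aggregate, where integer sums stand in for ST_Union and the worker/leader split is simulated with plain loops. All names here are illustrative, not PostgreSQL APIs:

```c
#include <stddef.h>

#define NWORKERS 3

/* transition: fold one row into a worker-local state */
static int trans(int state, int row) { return state + row; }

/* per-worker final step, run once per worker instead of once per row --
 * this is where a "final-per-worker" hook would do a cascaded union */
static int worker_final(int state) { return state; }

/* leader-side combine: merge one worker's result into the total */
static int combine(int total, int worker_result) { return total + worker_result; }

/* simulate NWORKERS workers each aggregating a slice of the rows */
static int
parallel_agg(const int *rows, size_t nrows)
{
    int total = 0;

    for (int w = 0; w < NWORKERS; w++)
    {
        int state = 0;

        for (size_t i = w; i < nrows; i += NWORKERS)    /* this worker's slice */
            state = trans(state, rows[i]);
        total = combine(total, worker_final(state));
    }
    return total;
}
```

The point of the per-worker step is visible in the loop structure: `worker_final` runs NWORKERS times rather than once per row, so an expensive once-per-worker operation amortizes well.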
Re: [HACKERS] Parallel Queries and PostGIS
Paul, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > I spent some time over the weekend trying out the different modes of > parallel query (seq scan, aggregate, join) in combination with PostGIS > and have written up the results here: > > http://blog.cleverelephant.ca/2016/03/parallel-postgis.html Neat! Regarding aggregate parallelism and the cascaded union approach, though I imagine in other cases as well, it seems like having a "final-per-worker" function for aggregates would be useful. Without actually looking at the code at all, it seems like that wouldn't be terribly difficult to add. Would you agree that it'd be helpful to have for making the st_union() work better in parallel? Though I do wonder if you would end up wanting to have a different final() function in that case.. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] Parallel Queries and PostGIS
I spent some time over the weekend trying out the different modes of parallel query (seq scan, aggregate, join) in combination with PostGIS and have written up the results here:

http://blog.cleverelephant.ca/2016/03/parallel-postgis.html

The TL;DR is basically:

* With some adjustments to function COST, both parallel sequence scan and parallel aggregation deliver very good parallel performance results.
* The cost adjustments for sequence scan and aggregate scan are not consistent in magnitude.
* Parallel join does not seem to work for PostGIS indexes yet, but perhaps there is some magic to learn from PostgreSQL core on that.

The two findings at the end are ones that need input from parallel query masters...

We recognize we'll have to adjust costs so that our particular use case (very CPU-intensive calculation per function) is planned better, but it seems like different query modes are interpreting costs in order-of-magnitude different ways in building plans.

Parallel join would be a huge win, so some help/pointers on figuring out why it's not coming into play when our gist operators are in effect would be helpful.

Happy Easter to you all, P
Re: [HACKERS] backup tools ought to ensure created backups are durable
On Mon, Mar 28, 2016 at 3:12 PM, Andres Freund wrote: > On 2016-03-28 11:35:57 +0200, Magnus Hagander wrote: > > On Mon, Mar 28, 2016 at 3:11 AM, Michael Paquier < > michael.paqu...@gmail.com> > > wrote: > > > > > On Mon, Mar 28, 2016 at 8:30 AM, Andres Freund > wrote: > > > > As pointed out in > > > > > > > > http://www.postgresql.org/message-id/20160327232509.v5wgac5vskuse...@awork2.anarazel.de > > > > our backup tools (i.e. pg_basebackup, pg_dump[all]), currently don't > > > > make any efforts to ensure their output is durable. > > > > > > > > I think for backup tools of possibly critical data, that's pretty > much > > > > unaceptable. > > > > > > Definitely agreed, once a backup/dump has been taken and those > > > utilities exit, we had better ensure that they are durably on disk. > > > For pg_basebackup and pg_dump, as everything except pg_dump/plain > > > require a target directory for the location of the output result, we > > > really can make things better. > > > > > > > > Definitely agreed on fixing it. But I don't think your summary is right. > > > > pg_basebackup in tar mode can be sent to stdout, does not require a > > directory. And the same for pg_dump in any mode except for directory. So > we > > can't just drive it off the mode, some more detailed checks are required. > > if (!isastty(stdout)) ought to do the trick, afaics? And maybe add a > warning somewhere in the docs about the tools not fsyncing if you pipe > their output data somewhere? > That should work yeah. And given that we already use that check in other places, it seems it should be perfectly safe. And as long as we only do a WARNING and not abort if the fsync fails, we should be OK if people intentionally store their backups on an fs that doesn't speak fsync (if that exists), in which case I don't really think we even need a switch to turn it off. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
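A minimal sketch of the behaviour being agreed on here, with a hypothetical helper name: skip the sync when the output went to a terminal (the isatty(3) check alluded to above), and emit only a WARNING, rather than aborting, when fsync fails. A real patch would also need to fsync the containing directory so the new file's directory entry is durable; that part is elided.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync an output fd, but never treat failure as
 * fatal -- a filesystem that doesn't support fsync just produces a
 * warning, so no on/off switch is needed.
 */
static void
fsync_output_or_warn(int fd, const char *label)
{
    if (isatty(fd))
        return;                 /* terminal output: nothing worth syncing */

    if (fsync(fd) != 0)
        fprintf(stderr, "WARNING: could not fsync %s: %s\n",
                label, strerror(errno));
}
```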
Re: [HACKERS] multivariate statistics v14
Tomas Vondra wrote:
> I'm not sure about the prototypes though. It was a bit weird because
> prototypes in the same header file were formatted very differently.

Yeah, it is very odd. What happens is that the BSD indent binary does one thing (return type is on one line and the function name on the following line; subsequent argument lines are aligned to the opening paren), then the pgindent perl script changes it (moves the function name to the same line as the return type, but does not reindent subsequent lines of arguments).

You can imitate the effect by adding an extra newline just before the function name, reflowing the arguments to align to the "(", then deleting the extra newline. Rather annoying.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
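The two stages described above can be illustrated on a hypothetical prototype (the function name is made up):

```c
/* Stage 1, BSD indent output: return type on its own line, function
 * name on the next, continuation arguments aligned to the paren. */
extern int
frobnicate_relation(int relation_oid,
                    int fork_number);

/* Stage 2, after the pgindent perl pass: the name is folded onto the
 * return-type line, but the continuation line keeps its old,
 * now-misaligned indentation. */
extern int frobnicate_relation(int relation_oid,
                    int fork_number);
```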
Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics
Andres Freund wrote:
> On 2016-03-28 15:46:43 +0300, Alexander Korotkov wrote:
> > @@ -932,8 +936,13 @@ ReadBuffer_common(SMgrRelation smgr, cha
> >
> >  	if (isLocalBuf)
> >  	{
> > -		/* Only need to adjust flags */
> > -		bufHdr->flags |= BM_VALID;
> > +		/*
> > +		 * Only need to adjust flags. Since it's local buffer, there is no
> > +		 * concurrency. We assume read/write pair to be cheaper than atomic OR.
> > +		 */
> > +		uint32		state = pg_atomic_read_u32(&bufHdr->state);
> > +		state |= BM_VALID;
> > +		pg_atomic_write_u32(&bufHdr->state, state);
>
> I'm not a fan of repeating that comment in multiple places.

Hm. Perhaps a set of macros should be offered that do "read, set some flag, write". Then the comments can be attached to the macro instead of to each callsite.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
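A sketch of what such a macro might look like, with a plain uint32 standing in for pg_atomic_uint32 and an invented name:

```c
#include <stdint.h>

/*
 * Hypothetical helper for the local-buffer case: the buffer is
 * backend-private, so there is no concurrency and an unlocked
 * read-modify-write pair is cheaper than an atomic OR. The
 * justification lives here, once, instead of at every call site.
 */
#define BUF_SET_FLAG_UNLOCKED(state_word, flag) \
	do { \
		uint32_t tmp_state_ = (state_word); \
		tmp_state_ |= (flag); \
		(state_word) = tmp_state_; \
	} while (0)
```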
Re: [HACKERS] Draft release notes for next week's releases
On Mon, Mar 28, 2016 at 10:24 AM, Tom Lane wrote:
> I'm also not exactly convinced by your implicit assumption that ICU is
> bug-free.

Noah spent some time looking at ICU back when he was at EnterpriseDB, and his conclusion was that ICU collations weren't stable across releases, which is pretty much the same problem we're running into with glibc collations. Now it might still be true that they have the equivalent of strxfrm() and strcoll() and that those things behave consistently with each other, and that would be very good. Everybody seems to agree it's faster, and that's good, too.

But I wonder what we do about the fact that, as with glibc, an ICU upgrade involves a REINDEX of every potentially affected index. It seems like ICU has some facilities built into it that might be useful for detecting and handling such situations, but I don't understand them well enough to know whether they'd solve our versioning problems or how effectively they would do so, and I think there are packaging issues that tie into it, too. http://userguide.icu-project.org/design mentions building with specific configure flags if you need to link with multiple server versions, and I don't know what operating system packagers typically do about that stuff.

In any case, I agree that we'd be very unwise to think that ICU is necessarily going to be bug-free without testing it carefully.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Dealing with collation and strcoll/strxfrm/etc
All, Changed the thread name (we're no longer talking about release notes...). * Tom Lane (t...@sss.pgh.pa.us) wrote: > Oleg Bartunov writes: > > Should we start thinking about ICU ? > > Isn't it still true that ICU fails to meet our minimum requirements? > That would include (a) working with the full Unicode character range > (not only UTF16) and (b) working with non-Unicode encodings. No doubt > we could deal with (b) by inserting a conversion, but that would take > a lot of shine off the performance numbers you mention. > > I'm also not exactly convinced by your implicit assumption that ICU is > bug-free. We have a wiki page about ICU. I'm not sure that it's current, but if it isn't and people are interested then perhaps we should update it: https://wiki.postgresql.org/wiki/Todo:ICU If we're going to talk about minimum requirements, I'd like to argue that we require whatever system we're using to have versioning (which glibc currently lacks, as I understand it...) to avoid the risk that indexes will become corrupt when whatever we're using for collation changes. I'm pretty sure that's already bitten us on at least some RHEL6 -> RHEL7 migrations in some locales, even forgetting the issues with strcoll vs. strxfrm. Regarding key abbreviation and performance, if we are confident that strcoll and strxfrm are at least independently internally consistent then we could consider offering an option to choose between them. We'd need to identify what each index was built with to do so, however, as they would need to be rebuilt if the choice changes, at least until/unless they're made to reliably agree. Even using only one or the other doesn't address the versioning problem though, which is a problem for all currently released versions of PG and is just going to continue to be an issue. Thanks! Stephen signature.asc Description: Digital signature
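The internal-consistency property at issue -- strcoll() and a strcmp() of strxfrm() keys must order every pair of strings the same way -- can be spot-checked with a sketch like this (buffer sizes are arbitrary; real code must handle strxfrm truncation and allocation):

```c
#include <locale.h>
#include <string.h>

static int
sign(int x)
{
    return (x > 0) - (x < 0);
}

/* Compare two strings via their strxfrm()-transformed keys; for a
 * well-behaved locale this must agree with strcoll() on every pair. */
static int
strxfrm_compare(const char *a, const char *b)
{
    char key_a[1024];           /* arbitrary size; assumes keys fit */
    char key_b[1024];

    strxfrm(key_a, a, sizeof(key_a));
    strxfrm(key_b, b, sizeof(key_b));
    return sign(strcmp(key_a, key_b));
}

/* returns 1 if the two comparison paths agree for this pair */
static int
collation_paths_agree(const char *a, const char *b)
{
    return sign(strcoll(a, b)) == strxfrm_compare(a, b);
}
```

Offering a choice between the two paths, as suggested above, would amount to requiring this predicate to hold before an index built with one path can be searched with the other.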
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
25.03.2016 01:12, Peter Geoghegan: On Thu, Mar 24, 2016 at 7:17 AM, Robert Haas wrote: I really like this idea, and the performance results seem impressive, but I think we should push this out to 9.7. A btree patch that didn't have WAL support until two and a half weeks into the final CommitFest just doesn't seem to me like a good candidate. First, as a general matter, if a patch isn't code-complete at the start of a CommitFest, it's reasonable to say that it should be reviewed but not necessarily committed in that CommitFest. You're right. Frankly, I thought that someone will help me with the path, but I had to finish it myself. *off-topic* I wonder, if we can add new flag to commitfest. Something like "Needs assistance", which will be used to mark big and complicated patches in progress. While "Needs review" means that the patch is almost ready and only requires the final review. This patch has had some review, but I'm not sure how deep that review is, and I think it's had no code review at all of the WAL logging changes, which were submitted only a week ago, well after the CF deadline. Second, the btree AM is a particularly poor place to introduce possibly destabilizing changes. Everybody depends on it, all the time, for everything. And despite new tools like amcheck, it's not a particularly easy thing to debug. Regrettably, I must agree. I don't see a plausible path to commit for this patch in the ongoing CF. I think that Anastasia did an excellent job here, and I wish I could have been of greater help sooner. Nevertheless, it would be unwise to commit this given the maturity of the code. There have been very few instances of performance improvements to the B-Tree code for as long as I've been interested, because it's so hard, and the standard is so high. 
The only example I can think of from the last few years is Kevin's commit 2ed5b87f96 and Tom's commit 1a77f8b63d, both of which were far less invasive, and Simon's commit c7111d11b1, which we just outright reverted from 9.5 due to subtle bugs (and even that was significantly less invasive than this patch). Improving nbtree is something that requires several rounds of expert review, and that's something that's in short supply for the B-Tree code in particular. I think that a new testing strategy is needed to make this easier, and I hope to get that going with amcheck. I need help with formalizing a "testing first" approach for improving the B-Tree code, because I think it's the only way that we can move forward with projects like this. It's *incredibly* hard to push forward patches like this given our current, limited testing strategy.

Unfortunately, I must agree. This patch seems unlikely to reach a final version before the feature freeze, so I'll move it to a future commitfest. Anyway, it means that now we have more time to improve the patch. If you have any ideas related to this patch, like prefix/suffix compression, I'll be glad to discuss them. Same for any other ideas of B-tree optimization.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] Draft release notes for next week's releases
Oleg Bartunov writes:
> Should we start thinking about ICU?

Isn't it still true that ICU fails to meet our minimum requirements? That would include (a) working with the full Unicode character range (not only UTF16) and (b) working with non-Unicode encodings. No doubt we could deal with (b) by inserting a conversion, but that would take a lot of shine off the performance numbers you mention.

I'm also not exactly convinced by your implicit assumption that ICU is bug-free.

regards, tom lane
Re: [HACKERS] Combinations of pg_strdup/free in pg_dump code
Michael Paquier writes:
> While reading some code of pg_dump, I noticed that the following
> pattern is heavily present:
> lanname = pg_strdup(stuff)
> free(lanname);
> When pg_strdup or any pg-related allocation routines are called, I
> think that we should use pg_free() and not free(). It does not matter
> much in practice because pg_free() actually calls free() and the
> latter per the POSIX spec should do nothing if the input pointer is
> NULL (some versions of SunOS actually crash on that :p), but we
> really had better be consistent in the calls done. Thoughts?

I do not think this is worth troubling over, really. If there are places that are relying on free(NULL) to work, it might be worth ensuring they go through pg_free; but the pattern you show here is perfectly safe. We have other things to do besides create code churn for this.

regards, tom lane
Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics
On 2016-03-28 15:46:43 +0300, Alexander Korotkov wrote: > diff --git a/src/backend/storage/buffer/bufmnew file mode 100644 > index 6dd7c6e..fe6fb9c > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > @@ -52,7 +52,6 @@ > #include "utils/resowner_private.h" > #include "utils/timestamp.h" > > - > /* Note: these two macros only work on shared buffers, not local ones! */ > #define BufHdrGetBlock(bufHdr) ((Block) (BufferBlocks + ((Size) > (bufHdr)->buf_id) * BLCKSZ)) > #define BufferGetLSN(bufHdr) (PageGetLSN(BufHdrGetBlock(bufHdr))) spurious > @@ -848,7 +852,7 @@ ReadBuffer_common(SMgrRelation smgr, cha >* it's not been recycled) but come right back here to try smgrextend >* again. >*/ > - Assert(!(bufHdr->flags & BM_VALID));/* spinlock not needed > */ > + Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID)); /* header > lock not needed */ > > bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : > BufHdrGetBlock(bufHdr); > > @@ -932,8 +936,13 @@ ReadBuffer_common(SMgrRelation smgr, cha > > if (isLocalBuf) > { > - /* Only need to adjust flags */ > - bufHdr->flags |= BM_VALID; > + /* > + * Only need to adjust flags. Since it's local buffer, there > is no > + * concurrency. We assume read/write pair to be cheaper than > atomic OR. > + */ > + uint32 state = pg_atomic_read_u32(&bufHdr->state); > + state |= BM_VALID; > + pg_atomic_write_u32(&bufHdr->state, state); I'm not a fan of repeating that comment in multiple places. Hm. 
> } > else > { > @@ -987,10 +996,11 @@ BufferAlloc(SMgrRelation smgr, char relp > BufferTag oldTag; /* previous identity of > selected buffer */ > uint32 oldHash;/* hash value for oldTag */ > LWLock *oldPartitionLock; /* buffer partition lock for it > */ > - BufFlagsoldFlags; > + uint32 oldFlags; > int buf_id; > BufferDesc *buf; > boolvalid; > + uint32 state; > > /* create a tag so we can lookup the buffer */ > INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum); > @@ -1050,23 +1060,23 @@ BufferAlloc(SMgrRelation smgr, char relp > for (;;) > { > /* > - * Ensure, while the spinlock's not yet held, that there's a > free > + * Ensure, while the header lock isn't yet held, that there's a > free >* refcount entry. >*/ > ReservePrivateRefCountEntry(); > > /* >* Select a victim buffer. The buffer is returned with its > header > - * spinlock still held! > + * lock still held! >*/ > - buf = StrategyGetBuffer(strategy); > + buf = StrategyGetBuffer(strategy, &state); The new thing really still is a spinlock, not sure if it's worth changing the comments that way. > @@ -1319,7 +1330,7 @@ BufferAlloc(SMgrRelation smgr, char relp > * InvalidateBuffer -- mark a shared buffer invalid and return it to the > * freelist. > * > - * The buffer header spinlock must be held at entry. We drop it before > + * The buffer header lock must be held at entry. We drop it before > * returning. (This is sane because the caller must have locked the > * buffer in order to be sure it should be dropped.) > * Ok, by now I'm pretty sure that I don't want this to be changed everywhere, just makes reviewing harder. 
> @@ -1433,6 +1443,7 @@ void > MarkBufferDirty(Buffer buffer) > { > BufferDesc *bufHdr; > + uint32 state; > > if (!BufferIsValid(buffer)) > elog(ERROR, "bad buffer ID: %d", buffer); > @@ -1449,14 +1460,14 @@ MarkBufferDirty(Buffer buffer) > /* unfortunately we can't check if the lock is held exclusively */ > Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr))); > > - LockBufHdr(bufHdr); > + state = LockBufHdr(bufHdr); > > - Assert(bufHdr->refcount > 0); > + Assert(BUF_STATE_GET_REFCOUNT(state) > 0); > > /* >* If the buffer was not dirty already, do vacuum accounting. >*/ > - if (!(bufHdr->flags & BM_DIRTY)) > + if (!(state & BM_DIRTY)) > { > VacuumPageDirty++; > pgBufferUsage.shared_blks_dirtied++; > @@ -1464,9 +1475,9 @@ MarkBufferDirty(Buffer buffer) > VacuumCostBalance += VacuumCostPageDirty; > } > > - bufHdr->flags |= (BM_DIRTY | BM_JUST_DIRTIED); > - > - UnlockBufHdr(bufHdr); > + state |= BM_DIRTY | BM_JUST_DIRTIED; > + state &= ~BM_LOCKED; > + pg_atomic_write_u32(&bufHdr->state, state); > } Hm, this is a routine I think we should avoid taking the lo
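The pattern under review in the MarkBufferDirty hunk -- the header lock living in the same atomic state word as the flags, so that the unlock can be folded into the final flag write ("state &= ~BM_LOCKED") -- can be sketched with C11 atomics. The BM_* bit assignments and function names here are illustrative, not the patch's:

```c
#include <stdatomic.h>
#include <stdint.h>

#define BM_LOCKED   (1u << 31)  /* illustrative bit assignments */
#define BM_DIRTY    (1u << 0)

/* Spin until we own the header lock; return the state as of locking. */
static uint32_t
lock_buf_hdr(_Atomic uint32_t *state)
{
    for (;;)
    {
        uint32_t old = atomic_fetch_or(state, BM_LOCKED);

        if (!(old & BM_LOCKED))
            return old | BM_LOCKED;     /* lock acquired */
        /* real code would use a spin-delay loop here */
    }
}

/* Publish the new flags and release the lock in one write. */
static void
unlock_buf_hdr_with(_Atomic uint32_t *state, uint32_t new_state)
{
    atomic_store(state, new_state & ~BM_LOCKED);
}
```

A caller then mirrors the patched MarkBufferDirty: lock, OR in BM_DIRTY locally, and store once.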
Re: [HACKERS] WIP: Covering + unique indexes.
21.03.2016 19:53, Anastasia Lubennikova: 19.03.2016 08:00, Peter Geoghegan: On Fri, Mar 18, 2016 at 5:15 AM, David Steele wrote: It looks like this patch should be marked "needs review" and I have done so. Uh, no it shouldn't. I've posted an extensive review on the original design thread. See CF entry: https://commitfest.postgresql.org/9/433/ Marked "Waiting on Author". Thanks to David, I've missed these letters at first. I'll answer here. * You truncate (remove suffix attributes -- the "included" attributes) within _bt_insertonpg(): - right_item = CopyIndexTuple(item); + indnatts = IndexRelationGetNumberOfAttributes(rel); + indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel); + + if (indnatts != indnkeyatts) + { + right_item = index_reform_tuple(rel, item, indnatts, indnkeyatts); + right_item_sz = IndexTupleDSize(*right_item); + right_item_sz = MAXALIGN(right_item_sz); + } + else + right_item = CopyIndexTuple(item); ItemPointerSet(&(right_item->t_tid), rbkno, P_HIKEY); I suggest that you do this within _bt_insert_parent(), instead, iff the original target page is know to be a leaf page. That's where it needs to happen for conventional suffix truncation, which has special considerations when determining which attributes are safe to truncate (or even which byte in the first distinguishing attribute it is okay to truncate past) I agree that _bt_insertonpg() is not right for truncation. Furthermore, I've noticed that all internal keys are solely the copies of "High keys" from the leaf pages. Which is pretty logical. Therefore, if we have already truncated the tuple, when it became a High key, we do not need the same truncation within _bt_insert_parent() or any other function. So the only thing to worry about is the HighKey truncation. I rewrote the code. Now only _bt_split cares about truncation. It's a bit more complicated to add it into index creation algorithm. There's a trick with a "high key". 
/* * We copy the last item on the page into the new page, and then * rearrange the old page so that the 'last item' becomes its high key * rather than a true data item. There had better be at least two * items on the page already, else the page would be empty of useful * data. */ /* * Move 'last' into the high key position on opage */ To be consistent with other steps of algorithm ( all high keys must be truncated tuples), I had to update this high key on place: delete the old one, and insert truncated high key. The very same logic I use to truncate posting list of a compressed tuple in the "btree_compression" patch. [1] I hope, both patches will be accepted, and then I'll thoroughly merge them . * I think the comparison logic may have a bug. Does this work with amcheck? Maybe it works with bt_index_check(), but not bt_index_parent_check()? I think that you need to make sure that _bt_compare() knows about this, too. That's because it isn't good enough to let a truncated internal IndexTuple compare equal to a scankey when non-truncated attributes are equal. It is a very important issue. But I don't think it's a bug there. I've read amcheck sources thoroughly and found that the problem appears at "invariant_key_less_than_equal_nontarget_offset() static bool invariant_key_less_than_equal_nontarget_offset(BtreeCheckState *state, Page nontarget, ScanKey key, OffsetNumber upperbound) { int16natts = state->rel->rd_rel->relnatts; int32cmp; cmp = _bt_compare(state->rel, natts, key, nontarget, upperbound); return cmp <= 0; } It uses scankey, made with _bt_mkscankey() which uses only key attributes, but calls _bt_compare with wrong keysz. If we wiil use nkeyatts = state->rel->rd_index->relnatts; instead of natts, all the checks would be passed successfully. Same for invariant_key_greater_than_equal_offset() and invariant_key_less_than_equal_nontarget_offset(). 
In my view, it's the correct way to fix this problem, because the caller is responsible for passing proper arguments to the function. Of course I will add a check into bt_compare, but I'd rather make it an assertion (see the patch attached). I'll add a flag to distinguish regular and truncated tuples, but it will not be used in this patch. Please, comment, if I've missed something. As you've already mentioned, neither high keys, nor tuples on internal pages are using "itup->t_tid.ip_posid", so I'll take one bit of it. It will definitely require changes in the future works on suffix truncation or something like that, but IMHO for now it's enough. Do you have any objections or comments? [1] https://commitfest.postgresql.org/9/494/ One more version of the patch is attached. I did more testing, and fixed couple of bugs. Now, if any indexed colum
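The "take one bit of ip_posid" idea mentioned above can be sketched as follows: offset numbers are 16 bits, and high keys and internal-page tuples don't use the field, so a high bit is free to flag a truncated tuple. The mask name is invented for illustration:

```c
#include <stdint.h>

typedef uint16_t OffsetNumber;

/* Invented name: high bit of t_tid.ip_posid flags a truncated tuple. */
#define BT_TRUNCATED_FLAG   ((OffsetNumber) 0x8000)

static void
mark_tuple_truncated(OffsetNumber *ip_posid)
{
    *ip_posid |= BT_TRUNCATED_FLAG;
}

static int
tuple_is_truncated(OffsetNumber ip_posid)
{
    return (ip_posid & BT_TRUNCATED_FLAG) != 0;
}
```

As noted in the mail, future suffix-truncation work would need to revisit this encoding, but one spare bit suffices for the present patch.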
[HACKERS] Combinations of pg_strdup/free in pg_dump code
Hi all,

While reading some code of pg_dump, I noticed that the following pattern is heavily present:

lanname = pg_strdup(stuff)
...
free(lanname);

One example is this:

lanname = get_language_name(fout, transforminfo[i].trflang);
if (typeInfo && lanname)
    appendPQExpBuffer(&namebuf, "%s %s",
                      typeInfo->dobj.name, lanname);
transforminfo[i].dobj.name = namebuf.data;
free(lanname);

And get_language_name() uses pg_strdup() to allocate the string freed here. When pg_strdup or any pg-related allocation routines are called, I think that we should use pg_free() and not free(). It does not matter much in practice because pg_free() actually calls free() and the latter per the POSIX spec should do nothing if the input pointer is NULL (some versions of SunOS actually crash on that :p), but we really had better be consistent in the calls done. Thoughts?

-- Michael
Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data
On Mon, Mar 28, 2016 at 10:08 PM, Thomas Munro wrote: > On Tue, Mar 29, 2016 at 1:56 AM, Thomas Munro > wrote: >> On Mon, Mar 28, 2016 at 8:54 PM, Michael Paquier >> wrote: >>> I have been also thinking a lot about this patch, and the fact that >>> the WAL receiver latch is being used within the internals of >>> libpqwalreceiver has been bugging me a lot, because this makes the >>> wait phase happening within the libpqwalreceiver depend on something >>> that only the WAL receiver had a only control on up to now (among the >>> things thought: having a second latch for libpqwalreceiver, having an >>> event interface for libpqwalreceiver, switch libpq_receive into being >>> asynchronous...). >> >> Yeah, it bugs me too. Do you prefer this? >> >> int walrcv_receive(char **buffer, int *wait_fd); >> >> Return value -1 means end-of-copy as before, return value 0 means "no >> data available now, please call me again when *wait_fd is ready to >> read". Then walreceiver.c can look after the WaitLatchOrSocket call >> and deal with socket readiness, postmaster death, timeout and latch, >> and libpqwalreceiver.c doesn't know anything about all that stuff >> anymore, but it is now part of the interface that it must expose a >> file descriptor for readiness testing when it doesn't have data >> available. >> >> Please find attached a new patch series which does it that way. > > Oops, there is a bug in the primary disconnection case when len == 1 > and it breaks out of the loop and wait_fd is invalid. I'll follow up > on that tomorrow, but I'm interested to hear your thoughts (and anyone > else's!) on that interface change and general approach. I definitely prefer that, that's neater! libpq_select could be simplified because a timeout does not matter much. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
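The proposed contract -- return -1 for end-of-copy, 0 for "no data yet, wait until *wait_fd is readable", positive for bytes received -- can be exercised with a toy stand-in. WaitLatchOrSocket is elided (a comment marks where it would go), and all names here are illustrative:

```c
#include <string.h>

/* Scripted data source: NULL entries model "no data ready yet". */
static const char *chunks[] = { NULL, "begin", NULL, "commit", NULL };
static int next_chunk = 0;

/* Toy version of the proposed walrcv_receive interface. */
static int
walrcv_receive_stub(char **buffer, int *wait_fd)
{
    *wait_fd = 42;                      /* fd the caller would wait on */
    if (next_chunk >= 5)
        return -1;                      /* end-of-copy */
    if (chunks[next_chunk] == NULL)
    {
        next_chunk++;
        return 0;                       /* caller should wait for readiness */
    }
    *buffer = (char *) chunks[next_chunk++];
    return (int) strlen(*buffer);
}

/* Caller loop of the shape walreceiver.c would use; returns the number
 * of payloads received and counts the waits via *nwaits. */
static int
receive_loop(int *nwaits)
{
    int received = 0;

    *nwaits = 0;
    for (;;)
    {
        char   *buf;
        int     wait_fd;
        int     len = walrcv_receive_stub(&buf, &wait_fd);

        if (len < 0)
            break;                      /* end-of-copy */
        if (len == 0)
        {
            /* here walreceiver.c would WaitLatchOrSocket(wait_fd, ...),
             * handling latch, postmaster death and timeout itself */
            (*nwaits)++;
            continue;
        }
        received++;                     /* process 'len' bytes in 'buf' */
    }
    return received;
}
```

This split keeps all event-waiting policy in the caller, which is exactly the point of the interface change: libpqwalreceiver only reports "would block" plus a file descriptor.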
Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data
On 2016-03-28 13:21:41 +0530, Pavan Deolasee wrote:
> On Wed, Mar 23, 2016 at 6:16 PM, Michael Paquier wrote:
> >
> > I'd just add dots at the end of the sentences in the comment blocks
> > because that's project style, but I'm being picky; other than that the
> > logic looks quite good.
>
> Since this is a bug affecting all stable branches, IMHO it will be a good
> idea to fix this before the upcoming minor releases.

It's definitely too late for that; they're going to be wrapped in a couple of hours.

Andres
Re: [HACKERS] backup tools ought to ensure created backups are durable
On 2016-03-28 11:35:57 +0200, Magnus Hagander wrote:
> On Mon, Mar 28, 2016 at 3:11 AM, Michael Paquier wrote:
> > On Mon, Mar 28, 2016 at 8:30 AM, Andres Freund wrote:
> > > As pointed out in
> > > http://www.postgresql.org/message-id/20160327232509.v5wgac5vskuse...@awork2.anarazel.de
> > > our backup tools (i.e. pg_basebackup, pg_dump[all]) currently don't
> > > make any efforts to ensure their output is durable.
> > >
> > > I think for backup tools of possibly critical data, that's pretty much
> > > unacceptable.
> >
> > Definitely agreed, once a backup/dump has been taken and those
> > utilities exit, we had better ensure that they are durably on disk.
> > For pg_basebackup and pg_dump, as everything except pg_dump/plain
> > requires a target directory for the location of the output result, we
> > really can make things better.
>
> Definitely agreed on fixing it. But I don't think your summary is right.
>
> pg_basebackup in tar mode can be sent to stdout, and does not require a
> directory. And the same for pg_dump in any mode except for directory. So we
> can't just drive it off the mode; some more detailed checks are required.

if (!isatty(stdout)) ought to do the trick, afaics? And maybe add a warning somewhere in the docs about the tools not fsyncing if you pipe their output data somewhere?

Andres
Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics
On 2016-03-28 11:48:46 +0530, Dilip Kumar wrote:
> On Sun, Mar 27, 2016 at 5:48 PM, Andres Freund wrote:
> >
> > What's sizeof(BufferDesc) after applying these patches? It should better
> > be <= 64...
>
> It is 72.

Ah yes, I miscalculated the required alignment.

Hm. So we have to get this smaller. I see three approaches:

1) Reduce the spinlock size on ppc. That actually might just work by
   replacing "unsigned int" with "unsigned char".
2) Replace the lwlock spinlock by a bit in LWLock->state. That'd avoid
   embedding the spinlock, and actually might allow avoiding one atomic
   op in a number of cases.
3) Shrink the size of BufferDesc by removing buf_id; that'd bring it to
   64 bytes.

I'm a bit hesitant to go for 3), because it'd likely end up adding a bit of arithmetic to a number of places in bufmgr.c. Robert, what do you think?

Andres
Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data
On Tue, Mar 29, 2016 at 1:56 AM, Thomas Munro wrote:
> On Mon, Mar 28, 2016 at 8:54 PM, Michael Paquier wrote:
>> I have also been thinking a lot about this patch, and the fact that
>> the WAL receiver latch is being used within the internals of
>> libpqwalreceiver has been bugging me a lot, because this makes the
>> wait phase happening within libpqwalreceiver depend on something
>> that only the WAL receiver had control over up to now (among the
>> options I thought about: having a second latch for libpqwalreceiver,
>> having an event interface for libpqwalreceiver, switching
>> libpq_receive into being asynchronous...).
>
> Yeah, it bugs me too. Do you prefer this?
>
> int walrcv_receive(char **buffer, int *wait_fd);
>
> Return value -1 means end-of-copy as before, return value 0 means "no
> data available now, please call me again when *wait_fd is ready to
> read". Then walreceiver.c can look after the WaitLatchOrSocket call
> and deal with socket readiness, postmaster death, timeout and latch,
> and libpqwalreceiver.c doesn't know anything about all that stuff
> anymore, but it is now part of the interface that it must expose a
> file descriptor for readiness testing when it doesn't have data
> available.
>
> Please find attached a new patch series which does it that way.

Oops, there is a bug in the primary disconnection case when len == 1: it
breaks out of the loop and wait_fd is invalid. I'll follow up on that
tomorrow, but I'm interested to hear your thoughts (and anyone else's!)
on that interface change and general approach.

--
Thomas Munro
http://www.enterprisedb.com
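The proposed contract (-1 for end-of-copy, 0 for "no data yet, wait until *wait_fd is readable", a positive length otherwise) can be modeled in miniature like this. This is a self-contained mock for illustration only; all names and behavior are invented, and the real caller would use WaitLatchOrSocket rather than simply retrying:

```c
#include <string.h>
#include <stddef.h>

/* Mock message stream: NULL entries stand for "no data available yet". */
static const char *stream[] = {"msg1", NULL, "msg2", NULL};
static size_t      pos = 0;

/*
 * Toy version of the proposed interface: -1 at end-of-copy, 0 when the
 * caller should wait for *wait_fd to become readable, otherwise the
 * length of the message placed in *buffer.
 */
static int
mock_walrcv_receive(const char **buffer, int *wait_fd)
{
    *wait_fd = 42;                  /* pretend socket fd for readiness tests */
    if (pos >= sizeof(stream) / sizeof(stream[0]))
        return -1;                  /* end-of-copy */
    if (stream[pos] == NULL)
    {
        pos++;
        return 0;                   /* no data now: caller waits on wait_fd */
    }
    *buffer = stream[pos++];
    return (int) strlen(*buffer);
}

/* Caller loop in the style walreceiver.c would use. */
static int
drain_stream(void)
{
    const char *buf;
    int         fd, got = 0, rc;

    while ((rc = mock_walrcv_receive(&buf, &fd)) != -1)
    {
        if (rc == 0)
            continue;               /* real code: WaitLatchOrSocket(fd, ...) */
        got++;                      /* real code: process the message */
    }
    return got;
}
```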
Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics
On Sun, Mar 27, 2016 at 4:31 PM, Alexander Korotkov < a.korot...@postgrespro.ru> wrote:
> On Sun, Mar 27, 2016 at 3:10 PM, Andres Freund wrote:
>> On 2016-03-27 12:38:25 +0300, Alexander Korotkov wrote:
>> > On Sat, Mar 26, 2016 at 1:26 AM, Alexander Korotkov wrote:
>> > > Thank you very much for testing!
>> > > I also got access to a 4 x 18 Intel server with 144 threads. I'm
>> > > going to post results of tests on this server next Monday.
>> >
>> > I've run pgbench tests on this machine:
>> > pgbench -s 1000 -c $clients -j 100 -M prepared -T 300.
>> > See results in the table and chart.
>> >
>> > clients    master        v3        v5
>> >       1     11671     12507     12679
>> >       2     24650     26005     25010
>> >       4     49631     48863     49811
>> >       8     96790     96441     99946
>> >      10    121275    119928    124100
>> >      20    243066    243365    246432
>> >      30    359616    342241    357310
>> >      40    431375    415310    441619
>> >      50    489991    489896    500590
>> >      60    538057    636473    554069
>> >      70    588659    714426    738535
>> >      80    405008    923039    902632
>> >      90    295443   1181247   1155918
>> >     100    258695   1323125   1325019
>> >     110    238842   1393767   1410274
>> >     120    226018   1432504   1474982
>> >     130    215102   1465459   1503241
>> >     140    206415   1470454   1505380
>> >     150    197850   1475479   1519908
>> >     160    190935   1420915   1484868
>> >     170    185835   1438965   1453128
>> >     180    182519   1416252   1453098
>> >
>> > My conclusions are the following:
>> > 1) We don't observe any regression in v5 in comparison to master.
>> > 2) v5 in most cases slightly outperforms v3.
>>
>> What commit did you base these tests on? I guess something recent,
>> after 98a64d0bd?
>
> Yes, more recent than 98a64d0bd. It was based on 676265eb7b.
>
>> > I'm going to do some code cleanup of v5 on Monday
>>
>> Ok, I'll try to do a review and possibly commit after that.
>
> Sounds good.

Here is the next revision of the patch. It contains mostly cosmetic
changes. Comments are adjusted to reflect the changes in behaviour.

I also changed the atomic AND/OR for local buffers to a read/write pair,
which I suppose would be cheaper. However, I don't insist on it, and it
could be reverted.

The patch is ready for your review. I'm especially interested in what you
think about the way I abstracted the spinlock's exponential back-off.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pinunpin-cas-6.patch
Description: Binary data
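For readers unfamiliar with the back-off idea mentioned at the end: a spinlock that fails its compare-and-swap can spin for an exponentially growing interval before retrying, which reduces cacheline ping-pong under contention. A rough self-contained sketch, illustrative only and not the patch's actual implementation (which builds on PostgreSQL's s_lock.c infrastructure):

```c
#include <stdatomic.h>

#define MAX_BACKOFF_SPINS 1024

typedef struct SpinLock
{
    atomic_uint locked;             /* 0 = free, 1 = held */
} SpinLock;

static void
spin_lock(SpinLock *lock)
{
    unsigned int spins = 1;         /* back-off: 1, 2, 4, ... iterations */

    for (;;)
    {
        unsigned int expected = 0;

        if (atomic_compare_exchange_weak(&lock->locked, &expected, 1))
            return;                 /* got the lock */

        /* Busy-wait, doubling the wait after each failed attempt. */
        for (volatile unsigned int i = 0; i < spins; i++)
            ;                       /* real code would emit a pause hint */
        if (spins < MAX_BACKOFF_SPINS)
            spins *= 2;             /* exponential growth, capped */
    }
}

static void
spin_unlock(SpinLock *lock)
{
    atomic_store(&lock->locked, 0);
}
```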
Re: [HACKERS] Relation extension scalability
On Mon, Mar 28, 2016 at 3:02 PM, Dilip Kumar wrote:
> 1. Relation size: No change in size; it's the same as base and v13.
>
> 2. INSERT 1028 Byte 1000 tuple performance
>    ---------------------------------------
>    Client   base    v13    v15
>         1    117    124    122
>         2    111    126    123
>         4     51    128    125
>         8     43    149    135
>        16     40    217    147
>        32     35    263    141
>
> 3. COPY 1 Tuple performance
>    ------------------------
>    Client   base    v13    v15
>         1    118    147    155
>         2    217    276    273
>         4    210    421    457
>         8    166    630    643
>        16    145    813    595
>        32    124    985    598
>
> Conclusion:
> -----------
> 1. I think v15 solves the problem that existed with v13, and performance
> is significantly higher compared to base, while the relation size is
> also stable. So IMHO v15 is the winner over the other solutions - what
> do others think?
>
> 2. There is no point in thinking that v13 is better than v15, because
> v13 has the bug of sometimes extending more pages than expected; that is
> uncontrolled, and the same can also be the reason v13 performs better.

Found one problem with v15, so I am sending a new version. In v15 I was
taking prev_blkno as targetBlock; instead, it should be the last block of
the relation at that time. Attaching the new patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

multi_extend_v16.patch
Description: Binary data
Re: [HACKERS] Two division by 0 errors in optimizer/plan/planner.c and optimizer/path/costsize.c
On 2016-03-28 11:33, Aleksander Alekseev wrote:
> Hello, Piotr.
>
> Thanks for the report. But I'm having some difficulties reproducing the
> issues you described.

Oh, if you want backtraces then either set a conditional breakpoint or
add your own Assert like I did.

> Also it would be a good idea to include these steps in the regression
> tests.

I agree, and I generally think that the more test cases touch previously
uncovered code paths the better, even if they had to be run as a
different make(1) target. Although it seems that at least some people
would agree (see [1]), the "make check" split somehow isn't happening.

[1] ca+tgmoymofe94+3wg3spg9sqajkv4sixjey+rdrmamye-vq...@mail.gmail.com
Re: [HACKERS] WIP: Access method extendability
Hi, Petr!

On Thu, Mar 17, 2016 at 10:56 AM, Petr Jelinek wrote:
> On 16/03/16 15:31, Teodor Sigaev wrote:
>>> Good catch, thanks! Tests were added.
>>
>> I don't see any objection; is consensus reached? I'm close to
>> committing that...
>
> I did only a cursory review on the bloom contrib module and don't
> really have complaints there, but I know you can review that one. I'd
> like the English of the generic_xlog.c description improved, but I
> won't get to it before the weekend.

What are your plans regarding the English of the generic_xlog.c
description?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
[HACKERS] Autovacuum to prevent wraparound tries to consume xid
Hackers,

one of our customers hit a near-xid-wraparound situation. The xid counter
reached the xidStopLimit value, so no transactions could be executed in
normal mode. What I noticed is the strange behaviour of the autovacuum
launched to prevent wraparound: it vacuums tables and updates pg_class
and pg_database, but then fails with the "database is not accepting
commands to avoid wraparound data loss in database" message. We ended up
in a situation where, according to pg_database, the maximum age of any
database was less than 200 million, but transactions still couldn't be
executed, because ShmemVariableCache wasn't updated (checked with gdb).

I've reproduced this situation on my laptop as follows:
1) Connect gdb, do "set ShmemVariableCache->nextXid = ShmemVariableCache->xidStopLimit"
2) Stop postgres
3) Make some fake clog: "dd bs=1m if=/dev/zero of=/usr/local/pgsql/data/pg_clog/07FF count=1024"
4) Start postgres

Then I found the same situation as in the customer database. The
autovacuum to prevent wraparound regularly produced the following
messages in the log:

ERROR:  database is not accepting commands to avoid wraparound data loss in database "template1"
HINT:  Stop the postmaster and vacuum that database in single-user mode. You might also need to commit or roll back old prepared transactions.

Finally all databases were frozen:

# SELECT datname, age(datfrozenxid) FROM pg_database;
  datname  │ age
───────────┼──────
 template1 │    0
 template0 │    0
 postgres  │ 5000
(3 rows)

but no transactions could be executed (ShmemVariableCache wasn't
updated).

After some debugging I found that vac_truncate_clog consumes an xid just
to produce a warning. I wrote a simple patch which replaces
GetCurrentTransactionId() with ShmemVariableCache->nextXid. That
completely fixes this situation for me: ShmemVariableCache was
successfully updated.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

fix_vac_truncate_clog_xid_consume.patch
Description: Binary data
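The "age" numbers above come from modular 32-bit xid arithmetic: age is the signed distance between nextXid and a frozen xid, and it remains meaningful across wraparound. A minimal model of that comparison (illustrative; the real code in access/transam additionally special-cases the permanent xids below FirstNormalTransactionId):

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Signed modular distance: how far frozen_xid lags behind next_xid.
 * Because the subtraction wraps modulo 2^32 before the signed cast,
 * the result stays correct even after the xid counter wraps around.
 */
static int32_t
xid_age(TransactionId next_xid, TransactionId frozen_xid)
{
    return (int32_t) (next_xid - frozen_xid);
}
```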
Re: [HACKERS] Draft release notes for next week's releases
Oleg Bartunov-2 wrote
> But still, icu provides us abbreviated keys and collation stability,

Does including ICU mean that collation handling is identical across
platforms? E.g. would a query on Linux involving string comparison yield
the same result on MacOS and Windows? If that is the case, I'm all for
it.

Currently the different behaviour in handling collation-aware string
comparisons is a bug in my eyes from a user's perspective. I do
understand and can accept the technical reasons for it, but it still
feels odd that a query yields different results (with identical data)
just because it runs on a different platform.
Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
On 2016/03/28 18:17, Rajkumar Raghuwanshi wrote:
> I am testing the postgres_fdw join pushdown feature for PostgreSQL 9.6,
> and I observed the issue below.
>
> Observation: Inner join and full outer join combination on a table
> generating wrong result.
>
> SELECT * FROM lt;
>  c1
> ----
>   1
>   2
> (2 rows)
>
> SELECT * FROM ft;
>  c1
> ----
>   1
>   2
> (2 rows)
>
> \d+ ft
>                      Foreign table "public.ft"
>  Column |  Type   | Modifiers | FDW Options | Storage | Stats target | Description
> --------+---------+-----------+-------------+---------+--------------+-------------
>  c1     | integer |           |             | plain   |              |
> Server: link_server
> FDW Options: (table_name 'lt')
>
> --inner join and full outer join on local tables
> SELECT t1.c1,t2.c1,t3.c1 FROM lt t1 INNER JOIN lt t2 ON (t1.c1 = t2.c1) FULL JOIN lt t3 ON (t2.c1 = t3.c1);
>  c1 | c1 | c1
> ----+----+----
>   1 |  1 |  1
>   2 |  2 |  2
> (2 rows)
>
> --inner join and full outer join on corresponding foreign tables
> SELECT t1.c1,t2.c1,t3.c1 FROM ft t1 INNER JOIN ft t2 ON (t1.c1 = t2.c1) FULL JOIN ft t3 ON (t2.c1 = t3.c1);
>  c1 | c1 | c1
> ----+----+----
>   1 |  1 |  1
>   1 |  2 |
>   2 |  1 |
>   2 |  2 |  2
> (4 rows)

I think the reason for that is in foreign_join_ok.
This in that function:

switch (jointype)
{
    case JOIN_INNER:
        fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                           list_copy(fpinfo_i->remote_conds));
        fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                           list_copy(fpinfo_o->remote_conds));
        break;

    case JOIN_LEFT:
        fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                          list_copy(fpinfo_i->remote_conds));
        fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                           list_copy(fpinfo_o->remote_conds));
        break;

    case JOIN_RIGHT:
        fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                          list_copy(fpinfo_o->remote_conds));
        fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
                                           list_copy(fpinfo_i->remote_conds));
        break;

    case JOIN_FULL:
        fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                          list_copy(fpinfo_i->remote_conds));
        fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
                                          list_copy(fpinfo_o->remote_conds));
        break;

    default:
        /* Should not happen; we have just checked this above */
        elog(ERROR, "unsupported join type %d", jointype);
}

wrongly pulls up remote_conds from the joining relations in the FULL JOIN
case. I think we should not pull up such conditions in the FULL JOIN
case.

Best regards,
Etsuro Fujita
Re: [HACKERS] A question on systable_beginscan()
Hi,

On 2016/03/25 23:49, Onder Kalaci wrote:
> Hi hackers,
>
> As documented in the source code, systable_beginscan() can be used on
> non-system tables as well. My question is: is it possible to write C
> code with systable_beginscan(), systable_getnext() and ScanKeys that is
> equivalent to the following query (assume that qual_column is a text
> column)?
>
> SELECT id FROM table WHERE qual_column::int = 15;
>
> In other words, can we cast a column type to another type via these
> low-level APIs? Are there any examples in the Postgres source?

I don't think the API advertises any support for that, or at least I
couldn't find any. However, I am not sure it's outright impossible to
achieve with some prep - initialize the scan key (sk_func) with the
FmgrInfo of a specialized function that performs the cast before
comparing. Such a function would be written to work with a fixed pair of
source and target types, so it could invoke, say, textout() -> int4in(),
before comparing the value with the argument (sk_argument). The argument
would have to be of the target type, needless to say. Most likely, this
would be an inadvisable course of action in the long run, so you would
be better off considering other ways (like SPI).

Thanks,
Amit
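The sk_func idea above can be modeled outside the backend like this. It is a toy stand-in, not the real ScanKey machinery; all names here are invented for illustration:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy model of a scan key whose comparison function casts the stored
 * text datum to the argument's type before comparing.  NOT the real
 * PostgreSQL ScanKeyData/FmgrInfo machinery. */
typedef struct ToyScanKey
{
    int   sk_argument;                        /* target-type argument */
    bool (*sk_func)(const char *datum, int arg);
} ToyScanKey;

/* Specialized "text = int" equality: parse the text, then compare. */
static bool
texteq_after_int_cast(const char *datum, int arg)
{
    return strtol(datum, NULL, 10) == arg;
}

/* Minimal "scan" over an array standing in for table rows. */
static int
count_matches(const char **rows, int nrows, const ToyScanKey *key)
{
    int matches = 0;

    for (int i = 0; i < nrows; i++)
        if (key->sk_func(rows[i], key->sk_argument))
            matches++;
    return matches;
}
```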
Re: [HACKERS] Alter or rename enum value
> I do not know whether this would be a meaningful improvement for
> common use-cases, though. (It'd help if people were more specific
> about the use-cases they need to work.)

For what it's worth, in the company I am working for, InnoGames GmbH,
not being able to alter enums is the number one pain point with
PostgreSQL. We have hundreds of small databases on the backend of the
game worlds. They make heavy use of enums. New enum values need to be
added or removed for new features. We rely on transactions for the
database migrations, so we couldn't make use of ALTER TYPE ADD VALUE.

Some functions found on the Internet [1] which change the values in the
catalog had been used, until they corrupted some indexes. (I believe
those functions are still quite common.)

Now, we are using a function to replace an enum type on all tables with
another one, but we are not at all happy with this solution. It requires
all objects which were using the enum to be dropped and recreated, and
it rewrites the tables, so it greatly increases the migration time and
effort.

[1] http://en.dklab.ru/lib/dklab_postgresql_enum/
Re: [HACKERS] Support for N synchronous standby servers - take 2
On Mon, Mar 28, 2016 at 5:50 PM, Kyotaro HORIGUCHI wrote: > Thank you for the new patch. Sorry to have overlooked some > versions. I'm looking the v19 patch now. > > make complains for an unused variable. > > | syncrep.c: In function ‘SyncRepGetSyncStandbys’: > | syncrep.c:601:13: warning: variable ‘next’ set but not used > [-Wunused-but-set-variable] > |ListCell *next; > > > At Thu, 24 Mar 2016 22:29:01 +0900, Masahiko Sawada > wrote in >> >> > SyncRepInitConfig()->SyncRepFreeConfig() has already pfree'd that >> >> > in the patch. Or am I missing something? >> > >> > Sorry, instead, the memory from strdup() will be abandoned in >> > upper level. (Thinking for some time..) Ah, I found that the >> > problem should be here. >> > >> > > SyncRepFreeConfig(SyncRepConfigData *config) >> > > { >> > ... >> > !> list_free(config->members); >> > > pfree(config); >> > > } >> > >> > The list_free *doesn't* free the memory blocks pointed by >> > lfirst(cell), which has been pstrdup'ed. It should be >> > list_free_deep(config->members) instead to free it completely. > > Fujii> Yep, but SyncRepFreeConfig() already uses list_free_deep > Fujii> in the latest patch. Could you read the latest version > Fujii> that I posted upthread. > > Sorry for overlooked the version. Every pair of parse(or > SyncRepUpdateConfig) and SyncRepFreeConfig is on the same memory > context so it seems safe (but might be fragile since it relies on > that the caller does so.). > >> >> Previous(9.5 or before) s_s_names also accepts non-ASCII character and >> >> non-printable character, and can show it without replacing these >> >> character to '?'. >> > >> > Thank you for pointint it out (it was completely out of my >> > mind..). I have no objection to keep the previous behavior. >> > >> >> From backward compatibility perspective, we should not choose #1 or #2. 
>> >> Different behaviour between previous and current s_s_names is that >> >> previous s_s_names doesn't accept the node name having the sort of >> >> white-space character that isspace() returns true with. >> >> But current s_s_names allows us to specify such a node name. >> >> I guess that changing such behaviour is enough for fixing this issue. >> >> Thoughts? >> > >> >> Attached latest patch incorporating all review comments so far. >> >> Aside from the review comments, I did following changes; >> - Add logic to avoid fatal exit in yy_fatal_error(). > > Maybe good catch, but.. > >> syncrep_scanstr(const char *str) > .. >> * Regain control after a fatal, internal flex error. It may have >> * corrupted parser state. Consequently, abandon the file, but trust > >> * that the state remains sane enough for syncrep_yy_delete_buffer(). > > > guc-file.l actually abandones the config file but syncrep_scanner > reads only a value of an item in it. And, the latter is > eventually true but a bit hard to understand. > > The patch will emit a mysterious error message like this. > >> invalid value for parameter "synchronous_standby_names": "2[a,b,c]" >> configuration file ".../postgresql.conf" contains errors > > This is utterly wrong. A bit related to that, it seems to me that > syncrep_scan.l doesn't need the same mechanism with > guc-file.l. The nature of the modification would be making > call_*_check_hook to be tri-state instead of boolean. So just > cathing errors in call_*_check_hook and ereport()'ing as SQL > parser does seems enough, but either will do for me. Well, I think that call_*_check_hook can not catch such a fatal error. Because if yy_fatal_error() is called without preventing logic when reloading configuration file, postmaster process will abnormal exit immediately as well as wal sender process. > >> - Improve regression test cases. > > I forgot to mention that, but additionalORDER BY makes the test > robust. 
> > I doubt the validity of the behavior in the following test.
> >> # Change the synchronous_standby_names = '2[standby1,*,standby2]' and check
> >> sync_state
> > Is this regarded as a correct value for it?

Since the previous s_s_names (9.5 or before) accepts this value, I didn't
change the behaviour. And I added this test case to check backward
compatibility more finely.

Regards,

--
Masahiko Sawada
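For readers following the syntax discussion: the value under test, '2[standby1,*,standby2]', follows the proposed 'N[name,name,...]' priority form. A toy parser for that shape is sketched below. It is illustrative only; the real patch uses a flex/bison scanner (syncrep_scan.l / syncrep_gram.y), not C string functions, and all names here are invented:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_MEMBERS 16

/* Toy representation of a parsed 'N[name,name,...]' value. */
typedef struct ToySyncRepConfig
{
    int  num_sync;                  /* the leading N */
    int  nmembers;
    char members[MAX_MEMBERS][64];  /* standby names, in priority order */
} ToySyncRepConfig;

/* Returns 0 on success, -1 if the value is not of the N[...] form. */
static int
parse_s_s_names(const char *value, ToySyncRepConfig *cfg)
{
    char  buf[256];
    char *open, *close, *tok;

    cfg->nmembers = 0;
    snprintf(buf, sizeof(buf), "%s", value);

    open = strchr(buf, '[');
    close = strrchr(buf, ']');
    if (open == NULL || close == NULL || close < open)
        return -1;                  /* not the N[...] form */

    *open = *close = '\0';
    cfg->num_sync = atoi(buf);      /* everything before '[' is N */
    if (cfg->num_sync <= 0)
        return -1;

    for (tok = strtok(open + 1, ","); tok && cfg->nmembers < MAX_MEMBERS;
         tok = strtok(NULL, ","))
    {
        while (*tok == ' ')
            tok++;                  /* trim leading spaces */
        snprintf(cfg->members[cfg->nmembers++], 64, "%s", tok);
    }
    return 0;
}
```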
Re: [HACKERS] backup tools ought to ensure created backups are durable
On Mon, Mar 28, 2016 at 3:11 AM, Michael Paquier wrote: > On Mon, Mar 28, 2016 at 8:30 AM, Andres Freund wrote: > > As pointed out in > > > http://www.postgresql.org/message-id/20160327232509.v5wgac5vskuse...@awork2.anarazel.de > > our backup tools (i.e. pg_basebackup, pg_dump[all]), currently don't > > make any efforts to ensure their output is durable. > > > > I think for backup tools of possibly critical data, that's pretty much > > unaceptable. > > Definitely agreed, once a backup/dump has been taken and those > utilities exit, we had better ensure that they are durably on disk. > For pg_basebackup and pg_dump, as everything except pg_dump/plain > require a target directory for the location of the output result, we > really can make things better. > > Definitely agreed on fixing it. But I don't think your summary is right. pg_basebackup in tar mode can be sent to stdout, does not require a directory. And the same for pg_dump in any mode except for directory. So we can't just drive it off the mode, some more detailed checks are required. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Two division by 0 errors in optimizer/plan/planner.c and optimizer/path/costsize.c
Hello, Piotr.

Thanks for the report. But I'm having some difficulties reproducing the
issues you described. I compiled PostgreSQL from the master branch on
FreeBSD 10.2 using this command:

```
CC=/usr/local/bin/gcc49 CFLAGS="-O0 -g" \
  ./configure --enable-cassert --enable-debug \
  --prefix=/home/eax/postgresql-install \
  && gmake clean && gmake -j2 -s
```

Then I ran reinit.sh:

```
#!/usr/bin/env bash

P=~/postgresql-install

pkill -9 postgres
make install
rm -rf $P/data
$P/bin/initdb -D $P/data

echo "max_prepared_transactions = 100" >> $P/data/postgresql.conf
echo "wal_level = hot_standby" >> $P/data/postgresql.conf
echo "wal_keep_segments = 128" >> $P/data/postgresql.conf
echo "max_connections = 10" >> $P/data/postgresql.conf
echo "listen_addresses = '*'" >> $P/data/postgresql.conf

echo '' > $P/data/logfile

echo "host all all 0.0.0.0/0 trust" >> $P/data/pg_hba.conf
echo "host replication all 0.0.0.0/0 trust" >> $P/data/pg_hba.conf
echo "local replication all trust" >> $P/data/pg_hba.conf

$P/bin/pg_ctl -w -D $P/data -l $P/data/logfile start
$P/bin/createdb `whoami`
$P/bin/psql -c "create table test(k int primary key, v text);"
```

Then I connected to PostgreSQL using psql; in a second terminal I
attached to the backend process using gdb710 and input `c`. Now in psql:

```
eax=# create table tt5(x int);
CREATE TABLE
eax=# create table b_star(x int);
CREATE TABLE
eax=# insert into b_star values (1), (2), (3);
INSERT 0 3
eax=# insert into tt5 values (2), (3), (4), (5);
INSERT 0 4
eax=# select 1
eax-# from public.tt5 as subq_0
eax-# where EXISTS (
eax(#    select 1
eax(#    from public.b_star as ref_0
eax(#    where false
eax(# );
 ?column?
----------
(0 rows)

eax=# select 1
eax-# from unnest('{}'::boolean[]) a (x)
eax-# left join (
eax(#    select *
eax(#    from unnest('{}'::boolean[])
eax(#    where false
eax(# ) b (x) on a.x = b.x;
 ?column?
----------
(0 rows)
```

Everything seems to work; no stacktraces in gdb.

Could you please provide more concrete steps to reproduce these issues,
i.e. OS and compiler version, compilation flags (or package version),
cluster config, database schema, etc.? These steps are required at least
to make sure that the fixed code really fixes the problem. Also it would
be a good idea to include these steps in the regression tests.

--
Best regards,
Aleksander Alekseev
http://eax.me/
Re: [HACKERS] Relation extension scalability
On Mon, Mar 28, 2016 at 7:21 AM, Dilip Kumar wrote:
>> I agree with that conclusion. I'm not quite sure where that leaves
>> us, though. We can go back to v13, but why isn't that producing extra
>> pages? It seems like it should: whenever a bulk extend rolls over to
>> a new FSM page, concurrent backends will search either the old or the
>> new one but not both.
>
> Our open question was why v13 is not producing extra pages. I tried to
> print some logs and debug. It seems to me that when blocks spill over
> to the next FSM page, the backends waiting on the lock will not find
> them, because they search the old FSM page. But the backend which is
> extending the pages will set RelationSetTargetBlock to the latest
> block, and that will make the new FSM page available for searching by
> new requesters.
>
> 1. This is why v13 (in normal cases*1) does not produce unused pages.
> 2. But it will produce extra pages (which will be consumed by new
> requesters), because all the waiters come one by one and each extends
> 512 pages.
>
> *1: Above I mentioned the normal case; there are cases where v13 can
> leave unused pages. For example, the waiters get the lock and extend
> pages one by one, but none of them gets down to
> RelationSetTargetBlock, so the new pages are not yet visible to new
> requesters. Eventually blocks spill over to a third FSM page; now the
> backends get down to RelationSetTargetBlock one by one, and suppose
> the last one sets it to a block on the third FSM page - in that case,
> the pages on the second FSM page are left unused.

>> Maybe we could do this - not sure if it's what you were suggesting
>> above:
>>
>> 1. Add the pages one at a time, and do RecordPageWithFreeSpace after
>> each one.
>> 2. After inserting them all, go back and update the upper levels of
>> the FSM tree up to the root.

> I think this is the better option. Since we will search the last two
> pages of the FSM tree, there is no need to update the upper levels of
> the FSM tree. Right?
> I will test and post the result with this option.

I have created this patch, and the results are below. (All test scripts
are the same as attached upthread.)

1. Relation size: No change in size; it's the same as base and v13.

2. INSERT 1028 Byte 1000 tuple performance
   ---------------------------------------
   Client   base    v13    v15
        1    117    124    122
        2    111    126    123
        4     51    128    125
        8     43    149    135
       16     40    217    147
       32     35    263    141

3. COPY 1 Tuple performance
   ------------------------
   Client   base    v13    v15
        1    118    147    155
        2    217    276    273
        4    210    421    457
        8    166    630    643
       16    145    813    595
       32    124    985    598

Conclusion:
-----------
1. I think v15 solves the problem that existed with v13, and performance
is significantly higher compared to base, while the relation size is
also stable. So IMHO v15 is the winner over the other solutions - what
do others think?

2. There is no point in thinking that v13 is better than v15, because
v13 has the bug of sometimes extending more pages than expected; that is
uncontrolled, and the same can also be the reason v13 performs better.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

multi_extend_v15.patch
Description: Binary data
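The two-step idea quoted above (record per-page free space first, then bring the upper FSM levels up to date in one pass) can be sketched on a toy max-tree, since the FSM is essentially a tree of maxima over heap pages. This is an illustrative stand-in, not the real storage/freespace code:

```c
#define FSM_LEAVES 8

/* Toy FSM: leaves[] hold per-page free space; root caches the maximum
 * so searches can skip the structure when nothing fits.  The real FSM
 * is a multi-level binary max-tree, but one cached level is enough to
 * show the two-step update. */
typedef struct ToyFsm
{
    int leaves[FSM_LEAVES];
    int root;                   /* max over leaves; may be stale */
} ToyFsm;

/* Step 1: record free space for one new page WITHOUT touching the
 * upper level -- cheap, done once per extended page. */
static void
record_page(ToyFsm *fsm, int page, int avail)
{
    fsm->leaves[page] = avail;
}

/* Step 2: after the bulk extension, one pass brings the upper level
 * up to date so other backends can find the new pages. */
static void
update_upper(ToyFsm *fsm)
{
    int max = 0;

    for (int i = 0; i < FSM_LEAVES; i++)
        if (fsm->leaves[i] > max)
            max = fsm->leaves[i];
    fsm->root = max;
}
```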
Re: [HACKERS] Draft release notes for next week's releases
On Mon, Mar 28, 2016 at 2:06 PM, Peter Geoghegan wrote:
> On Mon, Mar 28, 2016 at 12:55 AM, Oleg Bartunov wrote:
> > We'll post the patch.
>
> Cool.
>
> > Teodor made something to get abbreviated keys to work, as I remember.
> > I should say that I got a 27x improvement on my macbook. I will check
> > on Linux.
>
> I think that Linux will be much faster. The strxfrm() blob produced by
> Mac OSX will have a horribly low concentration of entropy. For an 8
> byte Datum, you get only 2 distinguishing bytes. It's really, really
> bad. Mac OSX probably makes very little use of strxfrm() in practice;
> there are proprietary APIs that do something similar, but all using
> UTF-16 only.

Yes, Linux is much, much faster; I see no difference in performance
using the latest ICU 57_1. I tested on Ubuntu 14.04.4.

But still, ICU provides us abbreviated keys and collation stability, so
let's add --with-icu.
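For context, "abbreviated keys" pack the leading bytes of the strxfrm() blob into a single datum so that most comparisons become a cheap memcmp; when a platform concentrates little entropy in those leading bytes, as described above for Mac OSX, the keys tie constantly and the benefit evaporates. A hedged sketch of the idea (not PostgreSQL's implementation; it assumes the default C locale, where strxfrm() essentially copies its input):

```c
#include <string.h>
#include <stdint.h>

/*
 * Build an 8-byte abbreviated key from the leading bytes of the
 * strxfrm() blob.  If the platform's strxfrm() puts little entropy in
 * those bytes, many keys compare equal and the full comparator must be
 * consulted, degrading the optimization.
 */
static uint64_t
abbrev_key(const char *s)
{
    char     blob[64] = {0};
    uint64_t key = 0;

    strxfrm(blob, s, sizeof(blob));
    /* big-endian packing, so integer order matches memcmp order */
    for (int i = 0; i < 8; i++)
        key = (key << 8) | (uint8_t) blob[i];
    return key;
}
```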
Re: [HACKERS] Support for N synchronous standby servers - take 2
On 2016/03/28 17:50, Kyotaro HORIGUCHI wrote: > > # LISPers don't hesitate to dive into Sea of Parens. Sorry in advance to be off-topic: https://xkcd.com/297 :) Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
Hi,

I am testing the postgres_fdw join pushdown feature for PostgreSQL 9.6,
and I observed the issue below.

*Observation:* An inner join and full outer join combination on a table
generates a wrong result.

SELECT * FROM lt;
 c1
----
  1
  2
(2 rows)

SELECT * FROM ft;
 c1
----
  1
  2
(2 rows)

\d+ ft
                     Foreign table "public.ft"
 Column |  Type   | Modifiers | FDW Options | Storage | Stats target | Description
--------+---------+-----------+-------------+---------+--------------+-------------
 c1     | integer |           |             | plain   |              |
Server: link_server
FDW Options: (table_name 'lt')

--inner join and full outer join on local tables
SELECT t1.c1,t2.c1,t3.c1 FROM lt t1 INNER JOIN lt t2 ON (t1.c1 = t2.c1) FULL JOIN lt t3 ON (t2.c1 = t3.c1);
 c1 | c1 | c1
----+----+----
  1 |  1 |  1
  2 |  2 |  2
(2 rows)

--inner join and full outer join on corresponding foreign tables
SELECT t1.c1,t2.c1,t3.c1 FROM ft t1 INNER JOIN ft t2 ON (t1.c1 = t2.c1) FULL JOIN ft t3 ON (t2.c1 = t3.c1);
 c1 | c1 | c1
----+----+----
  1 |  1 |  1
  1 |  2 |
  2 |  1 |
  2 |  2 |  2
(4 rows)

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation
Re: [HACKERS] [PATCH] Phrase search ported to 9.6
Hi! On Sat, Mar 26, 2016 at 12:02 AM, David Steele wrote: > On 3/25/16 3:54 PM, Artur Zakirov wrote: > > On 25.03.2016 21:42, Dmitry Ivanov wrote: > >> Sorry for the delay, I desperately needed some time to finish a bunch of > >> dangling tasks. > >> > >> I've added some new comments and clarified the ones that were obscure. > >> Moreover, I felt an urge to recheck most parts of the code since > >> apparently > >> nobody (besides myself) has gone so far yet. > >> > >> On 25.03.16 18:42 MSK, David Steele wrote: > >>> Time is short and it's not encouraging that you say there is "still > much > >> work to be done". > >> > >> Indeed, I was inaccurate. I am more than interested in the positive > >> outcome. > >> > > > > I tested the patch and take a look on it. All regression tests passed. > > The code looks good and the patch introduce a great functionality. > > > > I think the patch can be marked as "Ready for Commiter". But I do not > > feel the force to do that myself. > > > > Also I agree with you about tsvector_setweight(). There is not a problem > > with it because this weights are immutable and so there is not benefits > > from new function. > > Ideally Alexander can also look at it. If not, then you should mark it > "ready for committer" since I doubt any more reviewers will sign on this > late in the CF. > I've checked the last version of the patch. Patch applies cleanly on master, builds without warnings, regression tests pass. The issue I reported about tsquery textual representation is fixed. I'm going to mark this "Ready for Committer". -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Support for N synchronous standby servers - take 2
Thank you for the new patch. Sorry to have overlooked some versions. I'm looking the v19 patch now. make complains for an unused variable. | syncrep.c: In function ‘SyncRepGetSyncStandbys’: | syncrep.c:601:13: warning: variable ‘next’ set but not used [-Wunused-but-set-variable] |ListCell *next; At Thu, 24 Mar 2016 22:29:01 +0900, Masahiko Sawada wrote in > >> > SyncRepInitConfig()->SyncRepFreeConfig() has already pfree'd that > >> > in the patch. Or am I missing something? > > > > Sorry, instead, the memory from strdup() will be abandoned in > > upper level. (Thinking for some time..) Ah, I found that the > > problem should be here. > > > > > SyncRepFreeConfig(SyncRepConfigData *config) > > > { > > ... > > !> list_free(config->members); > > > pfree(config); > > > } > > > > The list_free *doesn't* free the memory blocks pointed by > > lfirst(cell), which has been pstrdup'ed. It should be > > list_free_deep(config->members) instead to free it completely. Fujii> Yep, but SyncRepFreeConfig() already uses list_free_deep Fujii> in the latest patch. Could you read the latest version Fujii> that I posted upthread. Sorry for overlooked the version. Every pair of parse(or SyncRepUpdateConfig) and SyncRepFreeConfig is on the same memory context so it seems safe (but might be fragile since it relies on that the caller does so.). > >> Previous(9.5 or before) s_s_names also accepts non-ASCII character and > >> non-printable character, and can show it without replacing these > >> character to '?'. > > > > Thank you for pointint it out (it was completely out of my > > mind..). I have no objection to keep the previous behavior. > > > >> From backward compatibility perspective, we should not choose #1 or #2. > >> Different behaviour between previous and current s_s_names is that > >> previous s_s_names doesn't accept the node name having the sort of > >> white-space character that isspace() returns true with. > >> But current s_s_names allows us to specify such a node name. 
> >> I guess that changing such behaviour is enough for fixing this issue. > >> Thoughts? > > > > Attached latest patch incorporating all review comments so far. > > Aside from the review comments, I did following changes; > - Add logic to avoid fatal exit in yy_fatal_error(). Maybe a good catch, but.. > syncrep_scanstr(const char *str) .. > * Regain control after a fatal, internal flex error. It may have > * corrupted parser state. Consequently, abandon the file, but trust > * that the state remains sane enough for syncrep_yy_delete_buffer(). guc-file.l actually abandons the config file, but syncrep_scanner reads only the value of one item in it. And the latter is eventually true, but a bit hard to understand. The patch will emit a mysterious error message like this. > invalid value for parameter "synchronous_standby_names": "2[a,b,c]" > configuration file ".../postgresql.conf" contains errors This is utterly wrong. A bit related to that, it seems to me that syncrep_scan.l doesn't need the same mechanism as guc-file.l. The nature of the modification would be making call_*_check_hook tri-state instead of boolean. So just catching errors in call_*_check_hook and ereport()'ing as the SQL parser does seems enough, but either will do for me. > - Improve regression test cases. I forgot to mention it, but the additional ORDER BY makes the test robust. I doubt the validity of the behavior in the following test. > # Change the synchronous_standby_names = '2[standby1,*,standby2]' and check > sync_state Is this regarded as a correct value for it? > Also I felt a sense of discomfort regarding using [ and ] as a special > character for priority method. > Because (, ) and [, ] are a little similar each other, so it would > easily make many syntax errors when nested style is supported. > And the synopsis of that in documentation is odd; > synchronous_standby_names = 'N [ node_name [, ...] ]' > > This topic has been already discussed before but, we might want to > change it to other characters such as < and >? I don't mind either, but as Robert said, it is true that characters conventionally used to enclose something should be preferred to other characters. Distinguishability of glyphs has less significance, perhaps. # LISPers don't hesitate to dive into Sea of Parens. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics v14
On 03/26/2016 08:09 PM, Alvaro Herrera wrote: Tomas Vondra wrote: There are a few places where I reverted the pgindent formatting, because it seemed a bit too weird - the first is the lists of function prototypes in common.h/mvstat.h, the second is the function calls to _greedy/_exhaustive methods. Function prototypes being weird is something that we've learned to accept. There's no point in undoing pgindent decisions there, because the next run will re-apply them anyway. Best not to fight it. What you should definitely look into fixing is the formatting of comments, if the result is too horrible. You can prevent it from messing those by adding dashes /*- at the beginning of the comment. Yep, formatting of some of the comments got slightly broken, but it wasn't difficult to fix that without the /*--- trick. I'm not sure about the prototypes though. It was a bit weird because prototypes in the same header file were formatted very differently. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics v14
Hi, On 03/26/2016 10:18 AM, Tatsuo Ishii wrote: Fair point. Attached is v18 of the patch, after pgindent cleanup. Here is some feedback on the v18 patch. 1) regarding examples in the create_statistics manual Here are the numbers I got. "with statistics" refers to the case where multivariate statistics are used. "without statistics" refers to the case where multivariate statistics are not used. The numbers denote estimated_rows/actual_rows, thus closer to 1.0 is better. Some numbers are shown as a fraction to avoid division by zero. In my understanding, cases 1, 3 and 4 show that multivariate statistics are superior.

        with statistics    without statistics
case1   0.98               0.01
case2   98/0               1/0

The case2 result shows that functional dependencies assume that the conditions used in queries won't be incompatible - that's something this type of statistics can't fix.

case3   1.05               0.01
case4   1/0                103/0
case5   18.50              18.33
case6   23/0               123/0

The last two lines (case5 + case6) seem a bit suspicious. I believe those are for the histogram data, and I do get these numbers:

case5   0.93 (5517 / 5949)    42.0 (249943 / 5949)
case6   100/0                 100/0

Perhaps you've been using the version before the bugfix, with ANALYZE on the wrong table? 2) the following comments by me are not addressed in the v18 patch: - There are no docs for pg_mv_statistic (they should be added to "49. System Catalogs") - The term "multivariate statistics" or something like it should appear in the index. - There should be some explanation of how to deal with multivariate statistics in the "14.1 Using Explain" and "14.2 Statistics used by the Planner" sections. Yes, those are valid omissions. I plan to address them, and I'm also considering adding a section to 65.1 (How the Planner Uses Statistics), explaining more thoroughly how the planner uses multivariate stats. 
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for next week's releases
On Mon, Mar 28, 2016 at 12:55 AM, Oleg Bartunov wrote: > We'll post the patch. Cool. > Teodor made something to get abbreviated keys work as > I remember. I should say, that 27x improvement I got on my macbook. I will > check on linux. I think that Linux will be much faster. The strxfrm() blob produced by Mac OSX will have a horribly low concentration of entropy. For an 8 byte Datum, you get only 2 distinguishing bytes. It's really, really bad. Mac OSX probably makes very little use of strxfrm() in practice; there are proprietary APIs that do something similar, but all using UTF-16 only. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for next week's releases
On Mon, Mar 28, 2016 at 1:21 PM, Peter Geoghegan wrote: > On Mon, Mar 28, 2016 at 12:08 AM, Oleg Bartunov > wrote: > > Should we start thinking about ICU ? I compare Postgres with ICU and > without > > and found 27x improvement in btree index creation for russian strings. > This > > includes effect of abbreviated keys and ICU itself. Also, we'll get > system > > independent locale. > > I think we should. I want to develop a detailed proposal before > talking about it more, though, because the idea is controversial. > > Did you use the FreeBSD ports patch? Do you have your own patch that > you could share? > We'll post the patch. Teodor made something to get abbreviated keys work as I remember. I should say, that 27x improvement I got on my macbook. I will check on linux. > > I'm not surprised that ICU is so much faster, especially now that > UTF-8 is not a second class citizen (it's been possible to build ICU > to specialize all its routines to handle UTF-8 for years now). As you > may know, ICU supports partial sort keys, and sort key compression, > which may have also helped: > http://userguide.icu-project.org/collation/architecture > > > That page also describes how binary sort keys are versioned, which > allows them to be stored on disk. It says "A common example is the use > of keys to build indexes in databases". We'd be crazy to trust Glibc > strxfrm() to be stable *on disk*, but ICU already cares deeply about > the things we need to care about, because it's used by other database > systems like DB2, Firebird, and in some configurations SQLite [1]. > > Glibc strxfrm() is not great with codepoints from the Cyrillic > alphabet -- it seems to store 2 bytes per code-point in the primary > weight level. So ICU might also do better in your test case for that > reason. > Yes, I see on this page, that ICU is ~3 times faster for russian text. 
http://site.icu-project.org/charts/collation-icu4c48-glibc > > [1] > https://www.sqlite.org/src/artifact?ci=trunk&filename=ext/icu/README.txt > -- > Peter Geoghegan >
Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data
On Sun, Mar 27, 2016 at 7:30 AM, Thomas Munro wrote: > On Sat, Mar 26, 2016 at 2:48 AM, Michael Paquier > wrote: >> Should we worried about potential backward-incompatibilities with the >> new return values of walrcv_receive? > > There are three changes to the walrcv_receive interface: > > 1. It now takes a latch pointer, which may be NULL. > > 2. If the latch pointer is non-NULL, the existing function might > return a new sentinel value WALRCV_RECEIVE_LATCH_SET. (The > pre-existing sentinel value -1 is still in use and has the same value > and meaning as before, but it now has a name: > WALRCV_RECEIVE_COPY_ENDED.) > > 3. It will no longer return when the process is signalled (a latch > should be used to ask it to return instead). > > Effectively, any client code would need to change at least to add NULL > or possibly a latch if it needs to ask it to return, and any > alternative implementation of the WAL receiver interface would need to > use WaitEventSet (or WaitLatchOrSocket) as its event loop instead of > whatever it might be using now so that it can detect a latch's state. > But in any case, any such code would fail to compile against 9.6 due > to the new argument, and then you'd only be able to get the new return > value if you asked for it by passing in a latch. What affected code > are we aware of -- either users of libpqwalreceiver.so or other WAL > receiver implementations? Any negative value returned by walrcv_receive would have stopped the replication stream, not only -1. And as you say, it's actually a good thing that the interface of walrcv_receive is changing with this patch, this way compilation would just fail and failures would not happen discreetly. That's too low-level to be mentioned in the release notes either way, so just a compilation failure is acceptable to me. >> Do you have numbers to share regarding how is performing the >> latch-based approach and the approach that used SIGUSR2 when >> remote_apply is used? 
> > I couldn't measure any significant change (Linux, all on localhost, 128 > cores): > > pgbench -c 1 -N bench2 -T 600 > > 0001-remote-apply-v5.patch (signals), remote_apply -> 449 TPS > 0001-remote-apply-v6.patch (latches), remote_apply -> 452 TPS > > pgbench -c 64 -j 32 -N bench2 -T 600 > > 0001-remote-apply-v5.patch (signals), remote_apply -> 8536 TPS > 0001-remote-apply-v6.patch (latches), remote_apply -> 8534 TPS Which concludes that both imply the same kind of performance. We are likely seeing noise. > Incidentally, I also did some testing on what happens when you signal > a process that is busily writing and fsyncing. I tested a few > different kernels, write sizes and disk latencies and saw that things > were fine on all of them up to 10k signals/sec but after that some > systems' fsync performance started to reduce. Only Linux on Power was > still happily fsyncing at around 3/4 of full speed when signalled with > a 2MHz+ tight kill loop (!), while FreeBSD and Linux on Intel weren't > able to make much progress at all under such adversity. So I suppose > that if you could get the commit rate up into 5 figures you might be > able to measure an improvement for the latch version due to > latch-collapsing, though I noticed a large amount of signal-collapsing > going on at the OS level on all systems I tested anyway, so maybe it > wouldn't make a difference. I attach that test program for interest. Interesting results. > Also, I updated the comment for the declaration of the latch in > walreceiver.h to say something about the new usage. > > New patch series attached. static void WalRcvQuickDieHandler(SIGNAL_ARGS); - static void ProcessWalRcvInterrupts(void) Noise here. + ret = WaitLatchOrSocket(latch, events, PQsocket(streamConn), timeout_ms); + + if (ret & WL_POSTMASTER_DEATH) + exit(0); Exiting within libpqwalreceiver.so is no good. I think that even in the case of a postmaster cleanup we should allow things to be cleaned up. 
/* + * Wake up the walreceiver if it happens to be blocked in walrcv_receive, + * and tell it that a commit record has been applied. + * + * This is called by the startup process whenever interesting xlog records + * are applied, so that walreceiver can check if it needs to send an apply + * notification back to the master which may be waiting in a COMMIT with + * synchronous_commit = remote_apply. + */ +void +WalRcvWakeup(void) +{ + SetLatch(&WalRcv->latch); +} I think it would be good to add an assertion here stating that this can only be called by the startup process while in recovery, as WalRcv->latch is not protected by a mutex lock. +maximum of 'timeout' ms. If a message was successfully read, returns +its length. Otherwise returns 0 for timeout, WALRCV_RECEIVE_COPY_ENDED +for disconnection or WALRCV_RECEIVE_LATCH_SET. On success, a pointer Having an assigned constant name for timeout would be good for consistency with the rest. I have been also thinking a lot about this patch, and the fact that the
Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data
On Wed, Mar 23, 2016 at 6:16 PM, Michael Paquier wrote: > > > I'd just add dots at the end of the sentences in the comment blocks > because that's project style, but I'm being picky, except that the > logic looks quite good. > Since this is a bug affecting all stable branches, IMHO it would be a good idea to fix this before the upcoming minor releases. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] pgbench - remove unused clientDone parameter
Remove pgbench clientDone unused "ok" parameter. I cannot get the point of keeping a useless parameter, which is probably there because at some point in the past it was used. If it is needed some day it can always be reinserted. -- Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 4196b0e..1146e16 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -1314,10 +1314,8 @@ preparedStatementName(char *buffer, int file, int state) } static bool -clientDone(CState *st, bool ok) +clientDone(CState *st) { - (void) ok; /* unused */ - if (st->con != NULL) { PQfinish(st->con); @@ -1388,7 +1386,7 @@ top: /* stop client if next transaction is beyond pgbench end of execution */ if (duration > 0 && st->txn_scheduled > end_time) - return clientDone(st, true); + return clientDone(st); /* * If this --latency-limit is used, and this slot is already late so @@ -1441,7 +1439,7 @@ top: if (!PQconsumeInput(st->con)) { /* there's something wrong */ fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n", st->id, st->state); -return clientDone(st, false); +return clientDone(st); } if (PQisBusy(st->con)) return true; /* don't have the whole result yet */ @@ -1488,7 +1486,7 @@ top: fprintf(stderr, "client %d aborted in state %d: %s", st->id, st->state, PQerrorMessage(st->con)); PQclear(res); - return clientDone(st, false); + return clientDone(st); } PQclear(res); discard_response(st); @@ -1504,7 +1502,7 @@ top: ++st->cnt; if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded) -return clientDone(st, true); /* exit success */ +return clientDone(st); /* exit success */ } /* increment state counter */ @@ -1541,7 +1539,7 @@ top: { fprintf(stderr, "client %d aborted while establishing connection\n", st->id); - return clientDone(st, false); + return clientDone(st); } INSTR_TIME_SET_CURRENT(end); INSTR_TIME_ACCUM_DIFF(thread->conn_time, end, start); @@ -1873,7 +1871,7 @@ top: bool ret = runShellCommand(st, 
argv[1], argv + 2, argc - 2); if (timer_exceeded) /* timeout */ -return clientDone(st, true); +return clientDone(st); else if (!ret) /* on error */ { st->ecnt++; @@ -1887,7 +1885,7 @@ top: bool ret = runShellCommand(st, NULL, argv + 1, argc - 1); if (timer_exceeded) /* timeout */ -return clientDone(st, true); +return clientDone(st); else if (!ret) /* on error */ { st->ecnt++; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgbench - show weight percent
This minor patch shows the expected drawing percent in multi-script reports, next to the relative weight. -- Fabiendiff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 4196b0e..3b63d69 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -3080,10 +3080,10 @@ printResults(TState *threads, StatsData *total, instr_time total_time, { if (num_scripts > 1) printf("SQL script %d: %s\n" - " - weight = %d\n" + " - weight = %d (targets %.1f%% of total)\n" " - " INT64_FORMAT " transactions (%.1f%% of total, tps = %f)\n", i + 1, sql_script[i].desc, - sql_script[i].weight, + sql_script[i].weight, 100.0 * sql_script[i].weight / total_weight, sql_script[i].stats.cnt, 100.0 * sql_script[i].stats.cnt / total->cnt, sql_script[i].stats.cnt / time_include); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
- that it does work:-) I'm not sure what happens in the script selection process; it should be checked carefully because it was not designed to allow a zero weight, and it may depend on its/their positions. It may already work, but it really needs checking. Hmmm, it seems ok. Attached is an updated patch, which: - I would suggest that a warning is shown when a weight is zero, something like "warning, script #%d weight is zero, will be ignored". includes such a warning. - the documentation should be updated:-) adds a line about 0 weight in the documentation. -- Fabien.
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index c6d1454..e31d5ee 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -698,6 +698,7 @@ pgbench options dbname Each script may be given a relative weight specified after a @ so as to change its drawing probability. The default weight is 1. + Weight 0 scripts are ignored. diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 4196b0e..0c1a0ee 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -2953,10 +2953,10 @@ parseScriptWeight(const char *option, char **script) fprintf(stderr, "invalid weight specification: %s\n", sep); exit(1); } - if (wtmp > INT_MAX || wtmp <= 0) + if (wtmp > INT_MAX || wtmp < 0) { fprintf(stderr, - "weight specification out of range (1 .. %u): " INT64_FORMAT "\n", + "weight specification out of range (0 .. 
%u): " INT64_FORMAT "\n", INT_MAX, (int64) wtmp); exit(1); } @@ -2987,6 +2987,13 @@ addScript(ParsedScript script) exit(1); } + if (script.weight == 0) + { + fprintf(stderr, +"warning, script \"%s\" (%d) weight is zero, will be ignored\n", +script.desc, num_scripts); + } + sql_script[num_scripts] = script; num_scripts++; } @@ -3527,6 +3534,12 @@ main(int argc, char **argv) /* cannot overflow: weight is 32b, total_weight 64b */ total_weight += sql_script[i].weight; + if (total_weight == 0) + { + fprintf(stderr, "total weight for scripts (-b and -f options) cannot be zero\n"); + exit(1); + } + /* show per script stats if several scripts are used */ if (num_scripts > 1) per_script_stats = true; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers