Re: [HACKERS] ALTER TABLESPACE MOVE command tag tweak
Fujii,

* Fujii Masao (masao.fu...@gmail.com) wrote:
> On Fri, Aug 22, 2014 at 8:14 AM, Stephen Frost <sfr...@snowman.net> wrote:
> > * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> > > ALTER TABLE ALL IN TABLESPACE xyz
> > >
> > > which AFAICS should work since ALL is already a reserved keyword.
> >
> > Pushed to master and REL9_4_STABLE.
>
> You seem to have forgotten to update tab-complete.c.
> Attached patch updates that.

Pushed.  Thanks again,

	Stephen

signature.asc
Description: Digital signature
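For readers following along, a minimal sketch of the syntax that was committed
(the tablespace names here are illustrative, not from the thread):

    -- Move all tables the current user owns out of tablespace "old_space":
    ALTER TABLE ALL IN TABLESPACE old_space SET TABLESPACE new_space;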
Re: [HACKERS] Built-in binning functions
On 04/09/14 21:16, Pavel Stehule wrote:
> I did a review of last patch

Thanks

> I found only one issue - float8 path has no own test in regress tests.
> When this issue will be fixed, I will mark this patch as ready for commit

Ah yeah I forgot about those, fixed in attached patch.

-- 
 Petr Jelinek                  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e50408c..93af6b4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -901,25 +901,36 @@
       <indexterm>
        <primary>width_bucket</primary>
       </indexterm>
-      <literal><function>width_bucket(<parameter>op</parameter> <type>numeric</type>, <parameter>b1</parameter> <type>numeric</type>, <parameter>b2</parameter> <type>numeric</type>, <parameter>count</parameter> <type>int</type>)</function></literal>
+      <literal><function>width_bucket(<parameter>operand</parameter> <type>numeric</type>, <parameter>b1</parameter> <type>numeric</type>, <parameter>b2</parameter> <type>numeric</type>, <parameter>count</parameter> <type>int</type>)</function></literal>
      </entry>
      <entry><type>int</type></entry>
      <entry>return the bucket to which <parameter>operand</> would
       be assigned in an equidepth histogram with <parameter>count</>
-      buckets, in the range <parameter>b1</> to <parameter>b2</></entry>
+      buckets spanning the range <parameter>b1</> to <parameter>b2</></entry>
      <entry><literal>width_bucket(5.35, 0.024, 10.06, 5)</literal></entry>
      <entry><literal>3</literal></entry>
     </row>

     <row>
-     <entry><literal><function>width_bucket(<parameter>op</parameter> <type>dp</type>, <parameter>b1</parameter> <type>dp</type>, <parameter>b2</parameter> <type>dp</type>, <parameter>count</parameter> <type>int</type>)</function></literal></entry>
+     <entry><literal><function>width_bucket(<parameter>operand</parameter> <type>dp</type>, <parameter>b1</parameter> <type>dp</type>, <parameter>b2</parameter> <type>dp</type>, <parameter>count</parameter> <type>int</type>)</function></literal></entry>
      <entry><type>int</type></entry>
      <entry>return the bucket to which <parameter>operand</> would
       be assigned in an equidepth histogram with <parameter>count</>
-      buckets, in the range <parameter>b1</> to <parameter>b2</></entry>
+      buckets spanning the range <parameter>b1</> to <parameter>b2</></entry>
      <entry><literal>width_bucket(5.35, 0.024, 10.06, 5)</literal></entry>
      <entry><literal>3</literal></entry>
     </row>
+
+    <row>
+     <entry><literal><function>width_bucket(<parameter>operand</parameter> <type>anyelement</type>, <parameter>thresholds</parameter> <type>anyarray</type>)</function></literal></entry>
+     <entry><type>int</type></entry>
+     <entry>return the bucket to which <parameter>operand</> would
+      be assigned given an array listing the upper bounds of the buckets;
+      the <parameter>thresholds</> array <emphasis>must be sorted</>,
+      smallest first, or unexpected results will be obtained</entry>
+     <entry><literal>width_bucket(now(), array['yesterday', 'today', 'tomorrow']::timestamptz[])</literal></entry>
+     <entry><literal>2</literal></entry>
+    </row>
    </tbody>
   </tgroup>
 </table>
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index f8e94ec..57376ea 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -15,8 +15,10 @@
 #include "postgres.h"

 #include <ctype.h>
+#include <math.h>

 #include "access/htup_details.h"
+#include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "utils/array.h"
@@ -130,6 +132,15 @@
 static ArrayType *array_replace_internal(ArrayType *array,
 					   Datum replace, bool replace_isnull,
 					   bool remove, Oid collation,
 					   FunctionCallInfo fcinfo);
+static int width_bucket_fixed(Datum operand,
+				   ArrayType *thresholds,
+				   Oid collation,
+				   TypeCacheEntry *typentry);
+static int width_bucket_fixed_float8(Datum operand, ArrayType *thresholds);
+static int width_bucket_variable(Datum operand,
+				   ArrayType *thresholds,
+				   Oid collation,
+				   TypeCacheEntry *typentry);

 /*
@@ -5502,3 +5513,219 @@ array_replace(PG_FUNCTION_ARGS)
 								   fcinfo);
 	PG_RETURN_ARRAYTYPE_P(array);
 }
+
+/*
+ * Implements width_bucket(anyelement, anyarray).
+ *
+ * 'thresholds' is an array containing upper bound values for each bucket;
+ * these must be sorted from smallest to largest, or bogus results will be
+ * produced.  If N thresholds are supplied, the output is from 0 to N:
+ * 0 is for inputs < first threshold, N is for inputs >= last threshold.
+ */
+Datum
+width_bucket_generic(PG_FUNCTION_ARGS)
+{
+	Datum		operand = PG_GETARG_DATUM(0);
+	ArrayType  *thresholds = PG_GETARG_ARRAYTYPE_P(1);
+	Oid			element_type = ARR_ELEMTYPE(thresholds);
+	int			result;
+
+	/* Check input */
+	if (ARR_NDIM(thresholds) > 1)
+		ereport(ERROR,
+				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+				 errmsg("thresholds must be one-dimensional array")));
+
+	if (array_contains_nulls(thresholds))
+		ereport(ERROR,
+				(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+				 errmsg("thresholds array
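As a quick orientation for reviewers, the new variadic form can be exercised
directly from SQL; this usage example is taken from the documentation hunk
above:

    SELECT width_bucket(now(),
                        array['yesterday', 'today', 'tomorrow']::timestamptz[]);
    -- returns 2: now() is >= 'today' but < 'tomorrow'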
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 6.9.2014 23:34, Andrew Gierth wrote:
>>>>> "Tomas" == Tomas Vondra <t...@fuzzy.cz> writes:
>
>  Tomas> I have significant doubts about the whole design, though.
>  Tomas> Especially the decision not to use HashAggregate,
>
> There is no decision not to use HashAggregate. There is simply no
> support for HashAggregate yet.
>
> Having it be able to work with GroupAggregate is essential, because
> there are always cases where HashAggregate is simply not permitted
> (e.g. when using distinct or sorted aggs; or unhashable types; or with
> the current code, when the estimated memory usage exceeds work_mem).
> HashAggregate may be a performance improvement, but it's something
> that can be added afterwards rather than an essential part of the
> feature.

Ah, OK. I got confused by the "final patch" subject, and so the
possibility of additional optimization somehow didn't occur to me.

>  Tomas> Now, the chaining only makes this worse, because it
>  Tomas> effectively forces a separate sort of the whole table for each
>  Tomas> grouping set.
>
> It's not one sort per grouping set, it's the minimal number of sorts
> needed to express the result as a union of ROLLUP clauses. The planner
> code will (I believe) always find the smallest number of sorts needed.

You're probably right. Although when doing GROUP BY CUBE (a,b,c,a)
I get one more ChainAggregate than with CUBE(a,b,c), and we seem to
compute all the aggregates twice. Not sure if we need to address this
though, because it's mostly user's fault.

> Each aggregate node can process any number of grouping sets as long
> as they represent a single rollup list (and therefore share a single
> sort order).
>
> Yes, this is slower than using one hashagg. But it solves the general
> problem in a way that does not interfere with future optimization.
>
> (HashAggregate can be added to the current implementation by first
> adding executor support for hashagg with multiple grouping sets, then
> in the planner, extracting as many hashable grouping sets as possible
> from the list before looking for rollup lists. The chained aggregate
> code can work just fine with a HashAggregate as the chain head. We
> have not actually tackled this, since I'm not going to waste any time
> adding optimizations before the basic idea is accepted.)

OK, understood.

>  Tomas> What I envisioned when considering hacking on this a few
>  Tomas> months back, was extending the aggregate API with "merge
>  Tomas> state" function,
>
> That's not really on the cards for arbitrary non-trivial aggregate
> functions. Yes, it can be done for simple ones, and if you want to
> use that as a basis for adding optimizations that's fine. But a
> solution that ONLY works in simple cases isn't sufficient, IMO.

I believe it can be done for most aggregates, assuming you have access
to the internal state somehow (not just the). Adding it for in-core
aggregates would not be difficult, in most cases. But you're right we
don't have this now at all.

regards
Tomas
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
>>>>> "Tomas" == Tomas Vondra <t...@fuzzy.cz> writes:

 > It's not one sort per grouping set, it's the minimal number of sorts
 > needed to express the result as a union of ROLLUP clauses. The
 > planner code will (I believe) always find the smallest number of
 > sorts needed.

 Tomas> You're probably right. Although when doing GROUP BY CUBE
 Tomas> (a,b,c,a) I get one more ChainAggregate than with CUBE(a,b,c),
 Tomas> and we seem to compute all the aggregates twice. Not sure if we
 Tomas> need to address this though, because it's mostly user's fault.

Hm. Yeah, you're right that the number of sorts is not optimal there.
We can look into that.

As for computing it all twice, there's currently no attempt to optimize
multiple identical grouping sets into multiple projections of a single
grouping set result. CUBE(a,b,c,a) has twice as many grouping sets as
CUBE(a,b,c) does, even though all the extra ones are duplicates.

-- 
Andrew (irc:RhodiumToad)
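To make the duplication being discussed concrete, here is a small SQL
sketch (table and column names are hypothetical):

    -- CUBE expands to the power set of its arguments, so repeating a
    -- column doubles the number of grouping sets:
    --   CUBE (a, b)    => GROUPING SETS ((a,b), (a), (b), ())
    --   CUBE (a, b, a) => 8 grouping sets, the extras all duplicates
    SELECT a, count(*)
    FROM t
    GROUP BY GROUPING SETS ((a), (a));
    -- per the spec this is equivalent to
    --   SELECT a, count(*) FROM t GROUP BY a
    --   UNION ALL
    --   SELECT a, count(*) FROM t GROUP BY a
    -- so every result row appears twice.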
Re: [HACKERS] Allowing implicit 'text' -> xml|json|jsonb
On 09/07/2014 02:24 AM, Tom Lane wrote:
>>> The problem here seems to be only related to mistyped parameters. Can
>>> we contain the damage to that part only somehow? Or make this
>>> optional (defaulting to off, I hope)?
>
>> I'd love to make it affect only parameters, actually, for v3 protocol
>> bind/parse/execute. That would be ideal.
>
> Well, let's talk about that. Doing something with parameter type
> assignment seems a lot less likely to result in unexpected
> side-effects than introducing a dozen new implicit casts.

I think it'd meet the needs of the group of users I see running into
issues and would minimise impact, so that sounds good if it's
practical. However, see below: it looks like just sending 'unknown'
instead of 'text' for strings from drivers might be the way to go. My
concerns about introducing new overloads may have been unfounded.

>> Right now the main workaround is to send all string-typed parameters
>> as 'unknown'-typed, but that causes a mess with function overload
>> resolution, and it's wrong most of the time when the parameter really
>> is just text.
>
> If you think adding implicit casts *won't* cause a mess with function
> overload resolution, I wonder why.
>
> Really though it seems like the question is how much clarity there is
> on the client side about what data types parameters should have. I get
> the impression that liberal use of "unknown" is really about the right
> thing in a lot of client APIs ...

So we'd be going down the path of asking client drivers to change how
they bind string-type parameters to 'unknown' by default, or asking
users to routinely change that default.

Thinking about it some more, that's really no different to how things
work right now when you write unparameterised queries using literals
without an explicit type-specifier or cast. Pg will resolve an
unknown-typed literal to text if there's ambiguity and one of the
choices is a text-type.

e.g. with md5(text) vs md5(bytea), if you call it with an unknown-typed
literal the text form is chosen:

regress=> SELECT md5('abcdef');
               md5
----------------------------------
 e80b5017098950fc58aad83c8c14978e
(1 row)

same as if you bind an explicitly unknown-typed parameter:

regress=> PREPARE md5p(unknown) AS SELECT md5($1);
PREPARE
regress=> EXECUTE md5p('abcdef');
               md5
----------------------------------
 e80b5017098950fc58aad83c8c14978e
(1 row)

In fact, to my surprise, using 'unknown' won't break callers who
currently send an explicit 'text' type when there's a 'varchar'
overload of the function:

regress=> create or replace function identity(varchar) returns text
language plpgsql as $$ begin raise notice 'varchar'; return $1; end; $$;
CREATE FUNCTION
regress=> create or replace function identity(text) returns text
language plpgsql as $$ begin raise notice 'text'; return $1; end; $$;
CREATE FUNCTION
regress=> SELECT identity('fred');
NOTICE:  text
 identity
----------
 fred
(1 row)

regress=> PREPARE identity_text(text) AS SELECT identity($1);
PREPARE
craig=> EXECUTE identity_text('fred');
NOTICE:  text
 identity
----------
 fred
(1 row)

regress=> PREPARE identity_unknown(unknown) AS SELECT identity($1);
PREPARE
craig=> EXECUTE identity_unknown('fred');
NOTICE:  text
 identity
----------
 fred
(1 row)

regress=> PREPARE identity_varchar(varchar) AS SELECT identity($1);
PREPARE
regress=> EXECUTE identity_varchar('fred');
NOTICE:  varchar
 identity
----------
 fred
(1 row)

So - if a driver currently sends 'varchar' for string types, and the
user has a 'varchar' and 'text' overload of the same function defined,
it'd change the overload selected. That's a (tiny) BC break.

Perhaps the solution here is just to make 'unknown' the default for
string-types in client drivers, make sure people have a way to change
it back, and relnote it clearly in the driver release?

In PgJDBC that's just a matter of changing the default for 'stringtype'
to 'unknown' in the 9.4 release.

-- 
 Craig Ringer                  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
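For reference, a sketch of what that driver-side change looks like to an
end user today (connection details hypothetical; 'stringtype' is the
existing PgJDBC connection parameter Craig mentions):

    jdbc:postgresql://localhost/regress?stringtype=unknown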
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 7.9.2014 15:11, Andrew Gierth wrote:
>>>>> "Tomas" == Tomas Vondra <t...@fuzzy.cz> writes:
>
> > It's not one sort per grouping set, it's the minimal number of
> > sorts needed to express the result as a union of ROLLUP clauses.
> > The planner code will (I believe) always find the smallest number
> > of sorts needed.
>
>  Tomas> You're probably right. Although when doing GROUP BY CUBE
>  Tomas> (a,b,c,a) I get one more ChainAggregate than with
>  Tomas> CUBE(a,b,c), and we seem to compute all the aggregates
>  Tomas> twice. Not sure if we need to address this though, because
>  Tomas> it's mostly user's fault.
>
> Hm. Yeah, you're right that the number of sorts is not optimal there.
> We can look into that.

I don't think it's very critical, though. I was worried about it
because of the sorts, but if that gets tackled in patches following
this commitfest it seems OK.

> As for computing it all twice, there's currently no attempt to
> optimize multiple identical grouping sets into multiple projections
> of a single grouping set result. CUBE(a,b,c,a) has twice as many
> grouping sets as CUBE(a,b,c) does, even though all the extra ones are
> duplicates.

Shouldn't this be solved by eliminating the excessive ChainAggregate?
Although it probably changes GROUPING(...), so it's not just about
removing the duplicate column(s) from the CUBE.

Maybe preventing this completely (i.e. raising an ERROR with duplicate
columns in CUBE/ROLLUP/... clauses) would be appropriate. Does the
standard say anything about this?

But arguably this is a minor issue, happening only when the user uses
the same column/expression twice. Hopefully the users don't do that
too often.

regards
Tomas
Re: [HACKERS] Improving PL/PgSQL
On 2014-09-06 9:47 PM, Pavel Stehule wrote:
> -1 .. to proposal .. It is in contradiction with current feature.

Which feature would that be?

> Next it is nonsense. INTO clause should to contains only plpgsql
> variables - in 9.x Postgres there is not possible issue.
>
> postgres=# create table x(a int, b int);
> CREATE TABLE
> postgres=# insert into x values(10,20);
> INSERT 0 1
> postgres=# create or replace function foo(out a int, out b int)
> postgres-# returns record as $$
> postgres$# begin
> postgres$#   select x.a, x.b from x into a, b;
> postgres$#   return;
> postgres$# end;
> postgres$# $$ language plpgsql;
> CREATE FUNCTION
> postgres=# select * from foo();
>  a  | b
> ----+----
>  10 | 20
> (1 row)
>
> you can see, there is not any collision

No, not if you do assignments only. But if you also use them as
parameters in plans at some point in the function, there will be
collisions.

But I consider that secondary. I think the bigger improvement is that
it's clear what assigning to OUT.foo does when reading the code.

.marko
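A minimal sketch of the collision Marko is referring to (a hypothetical
follow-on to Pavel's example): once the OUT variable is referenced
inside a query rather than only assigned to, it conflicts with the
table column of the same name:

    create or replace function foo2(out a int, out b int)
    returns record as $$
    begin
      -- "a" on the right-hand side is ambiguous between the OUT
      -- variable and x.a; depending on plpgsql.variable_conflict this
      -- errors out or silently picks the variable.
      select x.a, x.b into a, b from x where x.a > a;
      return;
    end;
    $$ language plpgsql;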
Re: [HACKERS] Selectivity estimation for inet operators
> > I updated the patch to cover semi and anti joins with
> > eqjoinsel_semi(). I think it is better than returning a constant.
>
> What you did there is utterly unacceptable from a modularity
> standpoint; and considering that the values will be nowhere near
> right, the argument that it's better than returning a constant seems
> pretty weak. I think you should just take that out again.

I will try to come up with a better, data type specific implementation
in a week.

New version with semi join estimation function attached.

diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..f6cb2f6 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,819 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"

+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
+
+
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+	 DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static Selectivity networkjoinsel_inner(Oid operator,
+	 VariableStatData *vardata1, VariableStatData *vardata2);
+static Selectivity networkjoinsel_semi(Oid operator,
+	 VariableStatData *vardata1, VariableStatData *vardata2);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+	 Datum *constvalue, short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *mcv1_values,
+	 float4 *mcv1_numbers, int mcv1_nvalues, Datum *mcv2_values,
+	 float4 *mcv2_numbers, int mcv2_nvalues, Oid operator,
+	 Selectivity *max_selec_p);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+	 int mcv_nvalues, Datum *his_values, int his_nvalues,
+	 short int opr_order, Selectivity *max_selec_p);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+	 int his1_nvalues, Datum *his2_values, int his2_nvalues,
+	 short int opr_order);
+static Selectivity inet_semi_join_selec(bool mcv2_exists, Datum *mcv2_values,
+	 int mcv2_nvalues, bool his2_exists, Datum *his2_values,
+	 int his2_nvalues, double his2_weight, Datum *constvalue,
+	 FmgrInfo *proc, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+	 short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+	 short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+	 short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+	int			varRelid = PG_GETARG_INT32(3),
+				his_nvalues;
+	VariableStatData vardata;
+	Node	   *other;
+	bool		varonleft;
+	Selectivity selec,
+				max_mcv_selec;
+	Datum		constvalue,
+			   *his_values;
+	Form_pg_statistic stats;
+	double		nullfrac;
+	FmgrInfo	proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic)
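For context, these estimators drive the planner's row-count guesses for
queries like the following (table and column names hypothetical):

    -- networksel() is consulted for restriction clauses such as:
    EXPLAIN SELECT * FROM hosts WHERE addr << inet '192.168.0.0/16';

    -- and the join estimators for clauses such as:
    EXPLAIN SELECT * FROM hosts h JOIN subnets s ON h.addr << s.net;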
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
>>>>> "Tomas" == Tomas Vondra <t...@fuzzy.cz> writes:

 > As for computing it all twice, there's currently no attempt to
 > optimize multiple identical grouping sets into multiple projections
 > of a single grouping set result. CUBE(a,b,c,a) has twice as many
 > grouping sets as CUBE(a,b,c) does, even though all the extra ones
 > are duplicates.

 Tomas> Shouldn't this be solved by eliminating the excessive
 Tomas> ChainAggregate? Although it probably changes GROUPING(...), so
 Tomas> it's not just about removing the duplicate column(s) from the
 Tomas> CUBE.

Eliminating the excess ChainAggregate would not change the number of
grouping sets, only where they are computed.

 Tomas> Maybe preventing this completely (i.e. raising an ERROR with
 Tomas> duplicate columns in CUBE/ROLLUP/... clauses) would be
 Tomas> appropriate. Does the standard say anything about this?

The spec does not say anything explicitly about duplicates, so they are
allowed (and duplicate grouping _sets_ can't be removed, only duplicate
columns within a single GROUP BY clause after the grouping sets have
been eliminated by transformation). I have checked my reading of the
spec against oracle 11 and MSSQL using sqlfiddle.

The way the spec handles grouping sets is to define a sequence of
syntactic transforms that result in a query which is a UNION ALL of
ordinary GROUP BY queries. (We haven't tried to implement the
additional optional feature of GROUP BY DISTINCT.)

Since it's UNION ALL, any duplicates must be preserved, so a query
with GROUPING SETS ((a),(a)) reduces to:

    SELECT ... GROUP BY a
    UNION ALL
    SELECT ... GROUP BY a;

and therefore has duplicates of all its result rows.

I'm quite prepared to concede that I may have read the spec wrong
(wouldn't be the first time), but in this case I require any such claim
to be backed up by an example from some other db showing an actual
difference in behavior.

-- 
Andrew (irc:RhodiumToad)
Re: [HACKERS] Adding a nullable DOMAIN column w/ CHECK
Noah Misch <n...@leadboat.com> writes:
> On Sat, Sep 06, 2014 at 02:01:32AM +0200, Marko Tiikkaja wrote:
>> To do this optimization we do have to assume that CHECKs in DOMAINs
>> are at least STABLE, but I don't see that as a problem; those should
>> be IMMUTABLE anyway, I think.

> The system has such assumptions already.

What bothers me about this general approach is that the check condition
is evaluated against a null whether or not there are any rows in the
table. This means that side-effects of the check condition would happen
even when they did not happen in the previous implementation.

Maybe that's all right, but to say it's all right you must make a
stronger form of the "check conditions are immutable" assumption than
we make elsewhere, ie not just that its result won't change but that it
has no visible evaluation side-effects. So I disagree with Noah's
conclusion that we're already assuming this.

As an example, if the check condition is such that it actually throws
an error (not just returns false) for null input, the ALTER command
would fail outright, whereas it would previously have succeeded as long
as the table is empty. (BTW, should not the patch be checking for a
false result?)

This objection could be met by doing a precheck to verify that the
table contains at least one live row. That's pretty ugly and personally
I'm not sure it's necessary, but I think there's room to argue that it
is.

			regards, tom lane
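A minimal sketch of the scenario Tom describes (all names hypothetical):

    -- A CHECK that throws, rather than returning false, for NULL input:
    create function check_positive(int) returns bool
    language plpgsql immutable as $$
    begin
      if $1 is null then
        raise exception 'null input';
      end if;
      return $1 > 0;
    end $$;

    create domain posint as int check (check_positive(value));

    create table empty_t ();
    -- Previously this succeeded on an empty table; if the optimization
    -- evaluates the CHECK once against NULL, it would now error out:
    alter table empty_t add column c posint;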
Re: [HACKERS] pgcrypto: PGP signatures
On Wed, Sep 3, 2014 at 2:13 PM, Marko Tiikkaja <ma...@joh.to> wrote:
> On 2014-09-03 10:33 PM, Jeff Janes wrote:
>> On Wed, Sep 3, 2014 at 12:43 PM, Marko Tiikkaja <ma...@joh.to> wrote:
>
> Right. This patch only adds support for signing data when encrypting
> it at the same time. There's no support for detached signatures, nor
> is there support for anything other than signatures of encrypted data.
> I should have been more clear on that in my initial email. :-(

OK, thanks. How hard do you think it would be to allow NULL (or empty
string?) passwords to pgp_sym_signatures and pgp_sym_decrypt_verify to
accommodate this?

> To sign without encrypting?

To verify signatures of things that are not encrypted. I'm not really
interested in storing private keys in PostgreSQL, just things that can
be done with public keys. (But I will make a dummy private key for
testing if I get that far.)

...

>> Once I wrap it in dearmor, I get the ERROR: No signature matching the
>> key id present in the message
>>
>> The public key block I am giving it is for the keyid that is reported
>> by pgp_sym_signatures, so I don't know what the problem might be.
>
> Have you tried with the debug=1 option? (It's undocumented, but it was
> like that before this patch and I didn't touch it).

I have now, but it didn't produce any output for this situation.

I have two theories for the problem. My test signed message was signed
with a keyring that had a signing subkey, so it was signed with that,
not with the master. Maybe it doesn't like that. Also, I created the
signed message in gpg, then imported it to PostgreSQL, and maybe it
doesn't like that.

I've never used the pgp functions of pgcrypto before, so I decided to
take a step back and try some of the functions that predate the
proposed patch. And I can't get them to work well, either. If I use
pgp_sym_encrypt to encrypt a message with AES, then pgp_sym_decrypt
will decrypt, and so will the gpg command line tool. But if I use gpg
to encrypt a message, pgp_sym_decrypt will not decrypt it.

select pgp_sym_decrypt(dearmor('-----BEGIN PGP MESSAGE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Password: foobar

jA0EBwMCqywsAv/hXJ7D0j8BWsD+9H7DY4KhrIIw2oV/6tBueVQ28+VDjBw9rGiy
3JRPmyXNN4wRTZXIyTVzK3LylWLomD9pQkao4hrQwSs=
=02RI
-----END PGP MESSAGE-----
'),'foobar','debug=1');
NOTICE:  dbg: parse_literal_data: data type=b
ERROR:  Not text data

So I don't know if I am doing something wrong, or if the PostgreSQL
implementation of pgp is just not interoperable with other
implementations. That makes it hard to test the new features if I can't
make the old ones work.

The two messages I am working with are:

Created: echo -n 'a message' | gpg -c --armor --cipher-algo AES -

-----BEGIN PGP MESSAGE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Password: foobar

jA0EBwMCqywsAv/hXJ7D0j8BWsD+9H7DY4KhrIIw2oV/6tBueVQ28+VDjBw9rGiy
3JRPmyXNN4wRTZXIyTVzK3LylWLomD9pQkao4hrQwSs=
=02RI
-----END PGP MESSAGE-----

and

Created: select armor(pgp_sym_encrypt('a message','foobar'));

-----BEGIN PGP MESSAGE-----

ww0EBwMCYzgp4dU3zCJ30joBViH28prwc9jIHhzUyXt31omiHao7NeOuLhCR0/uhAB6GRfYAXWVa
x+FTsW27F46/W7dlRjxCuzcu
=jQGZ
-----END PGP MESSAGE-----

Cheers,

Jeff
Re: [HACKERS] pgcrypto: PGP signatures
On 2014-09-07 19:28, Jeff Janes wrote:
> On Wed, Sep 3, 2014 at 2:13 PM, Marko Tiikkaja <ma...@joh.to> wrote:
>> To sign without encrypting?
>
> To verify signatures of things that are not encrypted. I'm not really
> interested in storing private keys in PostgreSQL, just things that can
> be done with public keys. (But I will make a dummy private key for
> testing if I get that far.)

Right. That functionality might be useful, but I think it should be a
separate patch completely. (And I doubt I have any interest in
implementing it).

>>> Once I wrap it in dearmor, I get the ERROR: No signature matching
>>> the key id present in the message
>>>
>>> The public key block I am giving it is for the keyid that is
>>> reported by pgp_sym_signatures, so I don't know what the problem
>>> might be.
>>
>> Have you tried with the debug=1 option? (It's undocumented, but it
>> was like that before this patch and I didn't touch it).
>
> I have now, but it didn't produce any output for this situation.
>
> I have two theories for the problem. My test signed message was signed
> with a keyring that had a signing subkey, so it was signed with that,
> not with the master. Maybe it doesn't like that.

Yeah, this patch only supports signing and verifying signatures with
main keys.

> Also, I created the signed message in gpg, then imported it to
> PostgreSQL, and maybe it doesn't like that.

That should not be a problem. I used gpg extensively when testing the
patch.

> I've never used the pgp functions of pgcrypto before, so I decided to
> take a step back and try some of the functions that predate the
> proposed patch. And I can't get them to work well, either. If I use
> pgp_sym_encrypt to encrypt a message with AES, then pgp_sym_decrypt
> will decrypt, and so will the gpg command line tool. But if I use gpg
> to encrypt a message, pgp_sym_decrypt will not decrypt it.
>
> select pgp_sym_decrypt(dearmor('-----BEGIN PGP MESSAGE-----
> Version: GnuPG v2.0.14 (GNU/Linux)
> Password: foobar
>
> jA0EBwMCqywsAv/hXJ7D0j8BWsD+9H7DY4KhrIIw2oV/6tBueVQ28+VDjBw9rGiy
> 3JRPmyXNN4wRTZXIyTVzK3LylWLomD9pQkao4hrQwSs=
> =02RI
> -----END PGP MESSAGE-----
> '),'foobar','debug=1');
> NOTICE:  dbg: parse_literal_data: data type=b
> ERROR:  Not text data
>
> So I don't know if I am doing something wrong, or if the PostgreSQL
> implementation of pgp is just not interoperable with other
> implementations. That makes it hard to test the new features if I
> can't make the old ones work.

The NOTICE here says what's wrong: the message has been marked to
contain binary data, not text. You should be able to decrypt it with
pgp_sym_decrypt_bytea() (and you can use convert_from() to get a text
value out).

.marko
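In other words, something along these lines should work on Jeff's
gpg-created message (a sketch; the '...' stands in for the armored text
from the previous mails):

    SELECT convert_from(
               pgp_sym_decrypt_bytea(
                   dearmor('-----BEGIN PGP MESSAGE----- ...'),
                   'foobar'),
               'utf8');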
Re: [HACKERS] Patch for psql History Display on MacOSX
Noah Misch <n...@leadboat.com> writes:
> I ran libedit-history-fixes-v3.patch through my previous libedit-28
> test. Now, patched psql writes ^A for newlines in any command. Here's
> the new matrix of behaviors when recalling history:
>
> master using master-written history:
>   oldest command: ok
>   rest: ok
> v3-patched using master-written history:
>   oldest command: ok
>   rest: ok
> master using v3-patched-written history:
>   oldest command: ok
>   rest: each broken if it contained a newline
> v3-patched using v3-patched-written history:
>   oldest command: ok
>   rest: ok

> That's probably the same result you saw. How does it compare to the
> compatibility effects for other libedit versions you tested?

Yeah, this is the behavior I'm expecting; libedit-13 and up all seem to
work like this.

It seems that in very old libedit versions (5.1, Tiger era) both the
existing loop and the patched version are able to iterate over more
than just the oldest command, though not necessarily the entire history
--- I've not figured out exactly what happens, and am not sure it's
worth bothering. This means that in a ~/.psql_history made with such a
version, at least some commands besides the oldest might've had
newlines converted to ^A. Such history files are currently broken when
forward-ported to any later libedit version; with the proposed patch,
though, we would read them correctly. This gain makes me think the
patch is worth the backwards-compatibility loss you mention.

			regards, tom lane
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 7.9.2014 18:52, Andrew Gierth wrote:
>>>>> "Tomas" == Tomas Vondra <t...@fuzzy.cz> writes:
>
>  Tomas> Maybe preventing this completely (i.e. raising an ERROR with
>  Tomas> duplicate columns in CUBE/ROLLUP/... clauses) would be
>  Tomas> appropriate. Does the standard say anything about this?
>
> The spec does not say anything explicitly about duplicates, so they
> are allowed (and duplicate grouping _sets_ can't be removed, only
> duplicate columns within a single GROUP BY clause after the grouping
> sets have been eliminated by transformation). I have checked my
> reading of the spec against oracle 11 and MSSQL using sqlfiddle.
>
> The way the spec handles grouping sets is to define a sequence of
> syntactic transforms that result in a query which is a UNION ALL of
> ordinary GROUP BY queries. (We haven't tried to implement the
> additional optional feature of GROUP BY DISTINCT.)
>
> Since it's UNION ALL, any duplicates must be preserved, so a query
> with GROUPING SETS ((a),(a)) reduces to:
>
>     SELECT ... GROUP BY a
>     UNION ALL
>     SELECT ... GROUP BY a;
>
> and therefore has duplicates of all its result rows.
>
> I'm quite prepared to concede that I may have read the spec wrong
> (wouldn't be the first time), but in this case I require any such
> claim to be backed up by an example from some other db showing an
> actual difference in behavior.

I think you read the spec right. Apparently duplicate grouping sets are
allowed, and it's supposed to output that grouping set twice. The
section on ROLLUP/CUBE does not mention duplicates at all, it only
explains how to generate all the possible grouping sets, so if you have
duplicate columns there, you'll get duplicate sets (which is allowed).

If we can get rid of the excessive ChainAggregate, that's certainly
enough for now. Optimizing it could be simple, though - you don't need
to keep the duplicate groups, you only need to keep a counter for how
many times to output each group.

But the more I think about it, the more I think we can ignore that.
There are far more important pieces to implement, and if you write bad
SQL there's no help anyway.

regards
Tomas
Re: [HACKERS] Built-in binning functions
2014-09-07 14:29 GMT+02:00 Petr Jelinek <p...@2ndquadrant.com>:
> On 04/09/14 21:16, Pavel Stehule wrote:
>> I did a review of last patch
>
> Thanks
>
>> I found only one issue - float8 path has no own test in regress
>> tests. When this issue will be fixed, I will mark this patch as
>> ready for commit
>
> Ah yeah I forgot about those, fixed in attached patch.

ok

It is ready for commit

Thank you

Pavel

> -- 
>  Petr Jelinek                  http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Built-in binning functions
On 2014-09-07 20:45:38 +0200, Pavel Stehule wrote:
> 2014-09-07 14:29 GMT+02:00 Petr Jelinek <p...@2ndquadrant.com>:
>> Ah yeah I forgot about those, fixed in attached patch.
>
> ok
>
> It is ready for commit

Tom, are you planning to take this one?

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Built-in binning functions
Andres Freund <and...@2ndquadrant.com> writes:
> Tom, are you planning to take this one?

Yeah, I already looked at it once, so will take it.

I think the main remaining issue is that we don't have consensus on the
function name AFAICT. I'm okay with using width_bucket(), as is done in
the latest patch, but there were objections ...

			regards, tom lane
Re: [HACKERS] Built-in binning functions
On 2014-09-07 15:05:35 -0400, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> Tom, are you planning to take this one?
>
> Yeah, I already looked at it once, so will take it.

Cool.

> I think the main remaining issue is that we don't have consensus on
> the function name AFAICT. I'm okay with using width_bucket(), as is
> done in the latest patch, but there were objections ...

Just reread that part of the thread and personally I disliked all the
other suggested names more than width_bucket.

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Adding a nullable DOMAIN column w/ CHECK
On Sun, Sep 07, 2014 at 01:06:04PM -0400, Tom Lane wrote:
> Noah Misch <n...@leadboat.com> writes:
>> On Sat, Sep 06, 2014 at 02:01:32AM +0200, Marko Tiikkaja wrote:
>>> To do this optimization we do have to assume that CHECKs in DOMAINs
>>> are at least STABLE, but I don't see that as a problem; those should
>>> be IMMUTABLE anyway, I think.
>
>> The system has such assumptions already.
>
> What bothers me about this general approach is that the check
> condition is evaluated against a null whether or not there are any
> rows in the table. This means that side-effects of the check condition
> would happen even when they did not happen in the previous
> implementation. Maybe that's all right, but to say it's all right you
> must make a stronger form of the "check conditions are immutable"
> assumption than we make elsewhere, ie not just that its result won't
> change but that it has no visible evaluation side-effects. So I
> disagree with Noah's conclusion that we're already assuming this.

Our assumption that domain CHECK constraints are STABLE doesn't grant
unlimited freedom to evaluate them, indeed.

> As an example, if the check condition is such that it actually throws
> an error (not just returns false) for null input, the ALTER command
> would fail outright, whereas it would previously have succeeded as
> long as the table is empty. (BTW, should not the patch be checking for
> a false result?)
>
> This objection could be met by doing a precheck to verify that the
> table contains at least one live row. That's pretty ugly and
> personally I'm not sure it's necessary, but I think there's room to
> argue that it is.

Yes; I doubt one could justify failing on an empty table as though it
had been a one-row table. I see a couple ways we could avoid the I/O
and complexity:

1) If contain_leaky_functions() approves every constraint expression,
   test the constraints once, and we're done. Otherwise, proceed as we
   do today.

2) Test the constraints in a subtransaction. If the subtransaction
   commits, we're done. Otherwise, proceed as we do today.

The more complexity you accept, the more cases you optimize; where best
to draw the line is not clear to me at this point.

Thanks,
nm
Re: [HACKERS] Adding a nullable DOMAIN column w/ CHECK
Noah Misch <n...@leadboat.com> writes:
> On Sun, Sep 07, 2014 at 01:06:04PM -0400, Tom Lane wrote:
>> This objection could be met by doing a precheck to verify that the
>> table contains at least one live row. That's pretty ugly and
>> personally I'm not sure it's necessary, but I think there's room to
>> argue that it is.

> Yes; I doubt one could justify failing on an empty table as though it
> had been a one-row table. I see a couple ways we could avoid the I/O
> and complexity:

> 1) If contain_leaky_functions() approves every constraint expression,
>    test the constraints once, and we're done. Otherwise, proceed as we
>    do today.

> 2) Test the constraints in a subtransaction. If the subtransaction
>    commits, we're done. Otherwise, proceed as we do today.

I'm not sure either of those is better than doing a single
heap_getnext(), which really should be pretty cheap except under
pathological conditions. It's the messiness I'm worried about more than
the cost.

			regards, tom lane
Re: [HACKERS] Built-in binning functions
On 07/09/14 21:09, Andres Freund wrote:
> On 2014-09-07 15:05:35 -0400, Tom Lane wrote:
>> I think the main remaining issue is that we don't have consensus on
>> the function name AFAICT. I'm okay with using width_bucket(), as is
>> done in the latest patch, but there were objections ...
>
> Just reread that part of the thread and personally I disliked all the
> other suggested names more than width_bucket.

Same here, that's why I didn't change it.

-- 
 Petr Jelinek                  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] gist vacuum gist access
Hello. I reworked vacuum for GiST indexes. All tests pass. I also
tested vacuum on a table with 2 million rows; all is OK. On my machine
the old vacuum takes about 9 seconds, this version about 6-7 seconds.
Review please. Thanks.

*** a/src/backend/access/gist/gistvacuum.c
--- b/src/backend/access/gist/gistvacuum.c
***************
*** 101,134 ****
  gistvacuumcleanup(PG_FUNCTION_ARGS)
  	PG_RETURN_POINTER(stats);
  }
  
- typedef struct GistBDItem
- {
- 	GistNSN		parentlsn;
- 	BlockNumber blkno;
- 	struct GistBDItem *next;
- } GistBDItem;
- 
- static void
- pushStackIfSplited(Page page, GistBDItem *stack)
- {
- 	GISTPageOpaque opaque = GistPageGetOpaque(page);
- 
- 	if (stack->blkno != GIST_ROOT_BLKNO && !XLogRecPtrIsInvalid(stack->parentlsn) &&
- 		(GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page)) &&
- 		opaque->rightlink != InvalidBlockNumber /* sanity check */ )
- 	{
- 		/* split page detected, install right link to the stack */
- 
- 		GistBDItem *ptr = (GistBDItem *) palloc(sizeof(GistBDItem));
- 
- 		ptr->blkno = opaque->rightlink;
- 		ptr->parentlsn = stack->parentlsn;
- 		ptr->next = stack->next;
- 		stack->next = ptr;
- 	}
- }
- 
- 
  /*
   * Bulk deletion of all index entries pointing to a set of heap tuples and
   * check invalid tuples left after upgrade.
--- 101,106 ----
***************
*** 145,283 ****
  gistbulkdelete(PG_FUNCTION_ARGS)
  	IndexBulkDeleteCallback callback = (IndexBulkDeleteCallback) PG_GETARG_POINTER(2);
  	void	   *callback_state = (void *) PG_GETARG_POINTER(3);
  	Relation	rel = info->index;
! 	GistBDItem *stack,
! 			   *ptr;
! 
! 	/* first time through? */
  	if (stats == NULL)
  		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
- 	/* we'll re-count the tuples each time */
  	stats->estimated_count = false;
  	stats->num_index_tuples = 0;
! 	stack = (GistBDItem *) palloc0(sizeof(GistBDItem));
! 	stack->blkno = GIST_ROOT_BLKNO;
! 
! 	while (stack)
! 	{
! 		Buffer		buffer;
! 		Page		page;
! 		OffsetNumber i,
! 					maxoff;
! 		IndexTuple	idxtuple;
! 		ItemId		iid;
! 
! 		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
! 									RBM_NORMAL, info->strategy);
! 		LockBuffer(buffer, GIST_SHARE);
! 		gistcheckpage(rel, buffer);
! 		page = (Page) BufferGetPage(buffer);
! 
! 		if (GistPageIsLeaf(page))
  		{
! 			OffsetNumber todelete[MaxOffsetNumber];
! 			int			ntodelete = 0;
! 
! 			LockBuffer(buffer, GIST_UNLOCK);
! 			LockBuffer(buffer, GIST_EXCLUSIVE);
  			page = (Page) BufferGetPage(buffer);
- 			if (stack->blkno == GIST_ROOT_BLKNO && !GistPageIsLeaf(page))
- 			{
- 				/* only the root can become non-leaf during relock */
- 				UnlockReleaseBuffer(buffer);
- 				/* one more check */
- 				continue;
- 			}
- 
- 			/*
- 			 * check for split proceeded after look at parent, we should check
- 			 * it after relock
- 			 */
- 			pushStackIfSplited(page, stack);
! 
! 			/*
! 			 * Remove deletable tuples from page
! 			 */
! 
! 			maxoff = PageGetMaxOffsetNumber(page);
! 
! 			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
  			{
! 				iid = PageGetItemId(page, i);
! 				idxtuple = (IndexTuple) PageGetItem(page, iid);
! 
! 				if (callback(&(idxtuple->t_tid), callback_state))
  				{
! 					todelete[ntodelete] = i - ntodelete;
! 					ntodelete++;
! 					stats->tuples_removed += 1;
  				}
- 				else
- 					stats->num_index_tuples += 1;
- 			}
- 
- 			if (ntodelete)
- 			{
- 				START_CRIT_SECTION();
- 
- 				MarkBufferDirty(buffer);
- 
- 				for (i = 0; i < ntodelete; i++)
- 					PageIndexTupleDelete(page, todelete[i]);
- 				GistMarkTuplesDeleted(page);
! 
! 				if (RelationNeedsWAL(rel))
! 				{
! 					XLogRecPtr	recptr;
! 
! 					recptr = gistXLogUpdate(rel->rd_node, buffer,
! 											todelete, ntodelete,
! 											NULL, 0, InvalidBuffer);
! 					PageSetLSN(page, recptr);
! 				}
! 				else
! 					PageSetLSN(page, gistGetFakeLSN(rel));
! 
! 				END_CRIT_SECTION();
! 			}
! 		}
! 		else
! 		{
! 			/* check for split proceeded after look at parent */
! 			pushStackIfSplited(page, stack);
! 
! 			maxoff = PageGetMaxOffsetNumber(page);
! 
! 			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
! 			{
! 				iid = PageGetItemId(page, i);
! 				idxtuple = (IndexTuple) PageGetItem(page, iid);
! 
! 				ptr = (GistBDItem *) palloc(sizeof(GistBDItem));
! 				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
! 				ptr->parentlsn = PageGetLSN(page);
! 				ptr->next = stack->next;
! 				stack->next = ptr;
! 
! 				if (GistTupleIsInvalid(idxtuple))
! 					ereport(LOG,
! 							(errmsg("index \"%s\" contains an inner tuple marked as invalid",
! 									RelationGetRelationName(rel)),
! 							 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
! 							 errhint("Please REINDEX it.")));
  			}
- 		}
- 
- 		UnlockReleaseBuffer(buffer);
! 
! 		ptr = stack->next;
! 		pfree(stack);
! 		stack = ptr;
! 
! 		vacuum_delay_point();
  	}
  
  	PG_RETURN_POINTER(stats);
  }
--- 117,208
Re: [HACKERS] Join push-down support for foreign tables
(2014/09/04 21:37), Robert Haas wrote:
> On Wed, Sep 3, 2014 at 5:16 AM, Shigeru Hanada <shigeru.han...@gmail.com> wrote:
>> (1) Separate cost estimation phases?
>> For existing join paths, planner estimates their costs in two phases.
>> In the first phase initial_cost_foo(), here foo is one of
>> nestloop/mergejoin/hashjoin, produces lower-bound estimates for
>> elimination. The second phase is done for only promising paths which
>> passed add_path_precheck(), by final_cost_foo() for cost and result
>> size. I'm not sure that we need to follow this manner, since FDWs
>> would be able to estimate final cost/size with their own methods.
>
> The main problem I see here is that accurate costing may require a
> round-trip to the remote server. If there is only one path that is
> probably OK; the cost of asking the question will usually be more than
> paid for by hearing that the pushed-down join clobbers the other
> possible methods of executing the query. But if there are many paths,
> for example because there are multiple sets of useful pathkeys, it
> might start to get a bit expensive.

I agree that requiring a round-trip per path is unbearable, so the main
source of plan cost should be local statistics gathered by the ANALYZE
command. If an FDW needs extra information for planning, it should
obtain that from FDW options or its private catalogs to avoid
undesirable round-trips.

FDWs like postgres_fdw would want to optimize plans by providing paths
with pathkeys (not only to use a remote index; it also allows MergeJoin
at an upper level). I noticed that the order of join consideration is
an issue too. The planner compares the currently-cheapest path to each
newly generated path, and mostly the foreign join path would be the
cheapest, so considering the foreign join first would reduce planner
overhead.

> Probably both the initial cost and final cost calculations should be
> delegated to the FDW, but maybe within postgres_fdw, the initial cost
> should do only the work that can be done without contacting the remote
> server; then, let the final cost step do that if appropriate. But I'm
> not entirely sure what is best here.

Agreed. I'll design the planner API along that way for now.

>> (2) How to reflect cost of transfer
>> Cost of transfer is dominant in foreign table operations, including
>> foreign scans. It would be nice to have some mechanism to reflect
>> actual time of transfer to the cost estimation. An idea is to have a
>> FDW option which represents cost factor of transfer, say
>> transfer_cost.
>
> That would be reasonable. I assume users would normally wish to
> specify this per-server, and the default should be something
> reasonable for a LAN.

This enhancement could be applied separately from the foreign join
patch.

>> (4) criteria for push-down
>> It is assumed that FDWs can push joins down to remote when all
>> foreign tables are in same server. IMO a SERVER object represents a
>> logical data source. For instance database for postgres_fdw and
>> other connection-based FDWs, and disk volumes (or directory?) for
>> file_fdw. Is this reasonable assumption?
>
> I think it's probably good to give an FDW the option of producing a
> ForeignJoinPath for any join against a ForeignPath *or
> ForeignJoinPath* for the same FDW. It's perhaps unlikely that an FDW
> can perform a join efficiently between two data sources with different
> server definitions, but why not give it the option? It should be
> pretty fast for the FDW to realize, oh, the server OIDs don't match -
> and at that point it can exit without doing anything further if that
> seems desirable. And there might be some kinds of data sources where
> cross-server joins actually can be executed quickly (e.g. when the
> underlying data is just in two files in different places on the local
> machine).

Indeed how to separate servers is left to users, or to the author of
the FDW, though postgres_fdw and most other FDWs can join foreign
tables within a server. I think it would be good if we could tell that
two foreign tables are managed by the same FDW from FdwRoutine; maybe
by adding a new API which returns an FDW identifier?

>> (5) Terminology
>> I used "foreign join" as a process which joins foreign tables on
>> *remote* side, but is this enough intuitive? Another idea is using
>> "remote join", is this more appropriate for this kind of process? I
>> hesitate to use "remote join" because it implies client-server FDWs,
>> but foreign join is not limited to such FDWs, e.g. file_fdw can have
>> extra file which is already joined files accessed via foreign tables.
>
> "Foreign join" is perfect.
>
> As I alluded to above, it's pretty important to make sure that this
> works with large join trees; that is, if I join four foreign tables, I
> don't want it to push down a join between two of the tables and a join
> between the other two tables and then join the results of those joins
> locally. Instead, I want to push the entire join tree to the foreign
> server and execute the whole thing there. Some care may be needed in
> designing the hooks to make sure this works as desired.

I think so
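For orientation, the kind of query the feature targets looks like this
(table names hypothetical); today each foreign table is scanned
separately and joined locally, whereas the proposal would let
postgres_fdw send the whole join to the remote server:

    SELECT o.id, c.name
    FROM orders_ft o
    JOIN customers_ft c ON o.customer_id = c.id;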
Re: [HACKERS] Join push-down support for foreign tables
(2014/09/05 0:56), Bruce Momjian wrote:
> On Thu, Sep 4, 2014 at 08:41:43PM +0530, Atri Sharma wrote:
>> On Thursday, September 4, 2014, Bruce Momjian <br...@momjian.us> wrote:
>>> On Thu, Sep 4, 2014 at 08:37:08AM -0400, Robert Haas wrote:
>>>> The main problem I see here is that accurate costing may require a
>>>> round-trip to the remote server. If there is only one path that is
>>>> probably OK; the cost of asking the question will usually be more
>>>> than paid for by hearing that the pushed-down join clobbers the
>>>> other possible methods of executing the query. But if there are
>>>> many paths, for example because there are multiple sets of useful
>>>> pathkeys, it might start to get a bit expensive.
>>>
>>> I am thinking eventually we will need to cache the foreign server
>>> statistics on the local server.
>>
>> Wouldn't that lead to issues where the statistics get outdated and we
>> have to anyways query the foreign server before planning any joins?
>> Or are you thinking of dropping the foreign table statistics once the
>> foreign join is complete?
>
> I am thinking we would eventually have to cache the statistics, then
> get some kind of invalidation message from the foreign server. I am
> also thinking that cache would have to be global across all backends,
> I guess similar to our invalidation cache.

If an FDW needs to know more information than pg_statistic and
pg_class have, yes, it should cache some statistics on the local side.
But such statistics would have an FDW-specific shape, so it would be
hard to have an API to manage them. An FDW can have its own functions
and tables to manage its own statistics, and it can even have a
background worker for messaging. But that would be another story.

Regards,
-- 
Shigeru HANADA
[HACKERS] what data type should be returned by sum(float4)
Hello,

http://www.postgresql.org/docs/devel/static/functions-aggregate.html

The documentation says that the return type of sum(expression) is:

    bigint for smallint or int arguments, numeric for bigint arguments,
    double precision for floating-point arguments, otherwise the same
    as the argument data type

Doesn't that say that sum(float4) returns float8? On the other hand,
the catalog definition seems to say it returns float4.

postgres=# select * from pg_aggregate where aggfnoid in (2110, 2111);
    aggfnoid    | aggkind | aggnumdirectargs | aggtransfn | aggfinalfn | aggmtransfn | aggminvtransfn | aggmfinalfn | aggfinalextra | aggmfinalextra | aggsortop | aggtranstype | aggtransspace | aggmtranstype | aggmtransspace | agginitval | aggminitval
----------------+---------+------------------+------------+------------+-------------+----------------+-------------+---------------+----------------+-----------+--------------+---------------+---------------+----------------+------------+-------------
 pg_catalog.sum | n       |                0 | float4pl   | -          | -           | -              | -           | f             | f              |         0 |          700 |             0 |             0 |              0 |            |
 pg_catalog.sum | n       |                0 | float8pl   | -          | -           | -              | -           | f             | f              |         0 |          701 |             0 |             0 |              0 |            |
(2 rows)

Which one shall be fixed?

Thanks,
-- 
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>
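A quick way to confirm which behavior is actually in effect (a sketch;
the catalog row above, with aggtranstype 700 and no final function,
implies the result will be real, i.e. float4):

    SELECT pg_typeof(sum(c)) FROM (VALUES (1.0::float4)) AS v(c);
    -- expected: real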
Re: [HACKERS] what data type should be returned by sum(float4)
Kouhei Kaigai <kai...@ak.jp.nec.com> writes:
> The documentation says that the return type of sum(expression) is:
>     bigint for smallint or int arguments, numeric for bigint
>     arguments, double precision for floating-point arguments,
>     otherwise the same as the argument data type
> Doesn't that say that sum(float4) returns float8?

Good catch!  [ digs in commit history... ]  It looks like the claim
that sum(float4) produces float8 was introduced in my commit
d06ebdb8d3425185d7e641d15e45908658a0177d of 2000-07-17; but it seems to
have been an outright error, because sum(float4) was a separate
aggregate long before that. Possibly a copy-and-paste mistake from some
other aggregate? I sure don't remember.

			regards, tom lane
Re: [HACKERS] proposal (9.5) : psql unicode border line styles
Pavel,

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
> 2014-07-23 8:38 GMT+02:00 Tomas Vondra <t...@fuzzy.cz>:
>> OK, thanks. The new version seems OK to me.
>
> Thank you

I've started looking over the patch and went back through the previous
thread about it. For my part, I'm in favor of adding this capability,
but I'm not terribly happy about how it was done. In particular,
get_line_style() seems pretty badly hacked around, and I don't really
like having the prepare_unicode_format call underneath it allocating
memory and then passing back up the need to free that memory via a new
field in the structure.

Also, on a quick glance, are you sure that the new 'unicode' output
matches the same as the old 'unicode' did (with pg_utf8format)?

I would think we'd simply set up a structure which is updated when the
linestyle is changed, which is surely going to be much less frequently
than the request for which linestyle to use happens, and handle all of
the line styles in more-or-less the same way rather than doing
something completely different for unicode than for the others.

Thanks,

Stephen

signature.asc
Description: Digital signature
[HACKERS] Re: proposal: ignore null fields in not relation type composite type based constructors
Pavel, All, * Pavel Stehule (pavel.steh...@gmail.com) wrote: Thank you I made a few minor adjustments, but the bigger issue (in my view) is what to do about array_to_json_pretty: it seems like we should do the same there, no? Perhaps you could propose a new patch which incorporates my minor changes and also removes array_to_json_pretty and makes array_to_json take an optional argument? Thoughts? Thanks, Stephen

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e50408c..c6c44a9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10294,11 +10294,13 @@ table2-mapping
       </row>
       <row>
        <entry>
-         <literal>row_to_json(record [, pretty_bool])</literal>
+         <literal>row_to_json(rowval record [, pretty bool [, ignore_nulls bool] ])</literal>
        </entry>
        <entry>
          Returns the row as a JSON object. Line feeds will be added between
-         level-1 elements if <parameter>pretty_bool</parameter> is true.
+         level-1 elements if <parameter>pretty_bool</parameter> is true. Elements
+         with NULL values will be skipped when <parameter>ignore_nulls</parameter>
+         is true.
        </entry>
        <entry><literal>row_to_json(row(1,'foo'))</literal></entry>
        <entry><literal>{"f1":1,"f2":"foo"}</literal></entry>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1bde175..02cf965 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -867,3 +867,10 @@
 RETURNS interval
 LANGUAGE INTERNAL
 STRICT IMMUTABLE
 AS 'make_interval';
+
+CREATE OR REPLACE FUNCTION
+  row_to_json(rowval record, pretty boolean DEFAULT false, ignore_nulls boolean DEFAULT false)
+RETURNS json
+LANGUAGE INTERNAL
+STRICT STABLE
+AS 'row_to_json';
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 494a028..ae6028e 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -79,7 +79,8 @@ static void report_invalid_token(JsonLexContext *lex);
 static int report_json_context(JsonLexContext *lex);
 static char *extract_mb_char(char *s);
 static void composite_to_json(Datum composite, StringInfo result,
-                  bool use_line_feeds);
+                  bool use_line_feeds,
+                  bool ignore_nulls);
 static void array_dim_to_json(StringInfo result, int dim, int ndims,
                   int *dims, Datum *vals, bool *nulls, int *valcount,
                   JsonTypeCategory tcategory, Oid outfuncoid,
@@ -1362,7 +1363,7 @@ datum_to_json(Datum val, bool is_null, StringInfo result,
             array_to_json_internal(val, result, false);
             break;
         case JSONTYPE_COMPOSITE:
-            composite_to_json(val, result, false);
+            composite_to_json(val, result, false, false);
             break;
         case JSONTYPE_BOOL:
             outputstr = DatumGetBool(val) ? "true" : "false";
@@ -1591,7 +1592,8 @@ array_to_json_internal(Datum array, StringInfo result, bool use_line_feeds)
 
 /*
  * Turn a composite / record into JSON.
  */
 static void
-composite_to_json(Datum composite, StringInfo result, bool use_line_feeds)
+composite_to_json(Datum composite, StringInfo result, bool use_line_feeds,
+                  bool ignore_nulls)
 {
     HeapTupleHeader td;
     Oid         tupType;
@@ -1630,6 +1632,12 @@ composite_to_json(Datum composite, StringInfo result, bool use_line_feeds)
         if (tupdesc->attrs[i]->attisdropped)
             continue;
 
+        val = heap_getattr(tuple, i + 1, tupdesc, &isnull);
+
+        /* Don't serialize NULL field when we don't want it */
+        if (isnull && ignore_nulls)
+            continue;
+
         if (needsep)
             appendStringInfoString(result, sep);
         needsep = true;
@@ -1638,8 +1646,6 @@
         escape_json(result, attname);
         appendStringInfoChar(result, ':');
 
-        val = heap_getattr(tuple, i + 1, tupdesc, &isnull);
-
         if (isnull)
         {
             tcategory = JSONTYPE_NULL;
@@ -1721,34 +1727,19 @@ array_to_json_pretty(PG_FUNCTION_ARGS)
 }
 
 /*
- * SQL function row_to_json(row)
+ * SQL function row_to_json(rowval record, pretty bool, ignore_nulls bool)
  */
 extern Datum
 row_to_json(PG_FUNCTION_ARGS)
 {
     Datum       array = PG_GETARG_DATUM(0);
-    StringInfo  result;
-
-    result = makeStringInfo();
-
-    composite_to_json(array, result, false);
-
-    PG_RETURN_TEXT_P(cstring_to_text_with_len(result->data, result->len));
-}
-
-/*
- * SQL function row_to_json(row, prettybool)
- */
-extern Datum
-row_to_json_pretty(PG_FUNCTION_ARGS)
-{
-    Datum       array = PG_GETARG_DATUM(0);
     bool        use_line_feeds = PG_GETARG_BOOL(1);
+    bool        ignore_nulls = PG_GETARG_BOOL(2);
     StringInfo  result;
 
     result = makeStringInfo();
 
-    composite_to_json(array, result, use_line_feeds);
+    composite_to_json(array, result, use_line_feeds, ignore_nulls);
 
     PG_RETURN_TEXT_P(cstring_to_text_with_len(result->data, result->len));
 }
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 5176ed0..5aeadc3 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4203,10 +4203,8 @@
DATA(insert OID =
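To make the proposed behavior concrete: assuming the patch above were applied, the new three-argument form would behave roughly as sketched below. This is my expectation from reading the patch, not output from a committed release.

-- pretty = false, ignore_nulls = true: the NULL column is skipped
SELECT row_to_json(row(1, NULL, 'foo'), false, true);
-- expected: {"f1":1,"f3":"foo"}

-- The defaulted call keeps today's behavior, NULLs included.
SELECT row_to_json(row(1, NULL, 'foo'));
-- expected: {"f1":1,"f2":null,"f3":"foo"}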
Re: [HACKERS] pgcrypto: PGP signatures
On Sun, Sep 7, 2014 at 10:36 AM, Marko Tiikkaja ma...@joh.to wrote: On 2014-09-07 19:28, Jeff Janes wrote:

select pgp_sym_decrypt(dearmor('-----BEGIN PGP MESSAGE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Password: foobar

jA0EBwMCqywsAv/hXJ7D0j8BWsD+9H7DY4KhrIIw2oV/6tBueVQ28+VDjBw9rGiy
3JRPmyXNN4wRTZXIyTVzK3LylWLomD9pQkao4hrQwSs=
=02RI
-----END PGP MESSAGE-----
'),'foobar','debug=1');
NOTICE: dbg: parse_literal_data: data type=b
ERROR: Not text data

So I don't know if I am doing something wrong, or if the PostgreSQL implementation of PGP is just not interoperable with other implementations. That makes it hard to test the new features if I can't make the old ones work. The NOTICE here says what's wrong: the message has been marked to contain binary data, not text. You should be able to decrypt it with pgp_sym_decrypt_bytea() (and you can use convert_from() to get a text value out). OK, thanks. That is obvious in retrospect. I'll put it on my todo list to try to clean up some of the documentation and error messages to make this more obvious to the naive user, but that is not part of this patch. One problem I've run into now: if I try to sign a message with pgp_pub_encrypt_sign but give it the public, not the private, key as the third argument, it generates this message: ERROR: Cannot decrypt with public key. It should say 'sign', not 'decrypt'. Similarly for verification: ERROR: Refusing to encrypt with secret key. Here 'encrypt' should be 'verify signature'. Cheers, Jeff
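For anyone who hits the same "Not text data" error, here is a minimal, self-contained sketch of the bytea route Marko describes. It round-trips binary data with pgp_sym_encrypt_bytea() so it does not depend on the GnuPG-produced message above:

-- Requires the pgcrypto extension.
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Encrypt as binary, decrypt with the bytea variant, then convert to text.
SELECT convert_from(
         pgp_sym_decrypt_bytea(
           pgp_sym_encrypt_bytea(convert_to('some text', 'utf8'), 'foobar'),
           'foobar'),
         'utf8');
-- expected: some text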
[HACKERS] Re: proposal: ignore null fields in non-relation composite-type-based constructors
Hi 2014-09-08 5:48 GMT+02:00 Stephen Frost sfr...@snowman.net: Pavel, All, * Pavel Stehule (pavel.steh...@gmail.com) wrote: Thank you I made a few minor adjustments, but the bigger issue (in my view) is what to do about array_to_json_pretty: it seems like we should do the same there, no? Perhaps you could propose a new patch which incorporates my minor changes and also removes array_to_json_pretty and makes array_to_json take an optional argument? I thought about it, and I am not sure. If somebody uses an array as a set, it can be a good idea; when it is used as a real array, you cannot just throw the NULLs away. I am inclined to think it is not necessary, because we can remove NULLs from an array simply by iterating over it (which is impossible for row fields):

CREATE OR REPLACE FUNCTION remove_null(anyarray)
RETURNS anyarray AS $$
  SELECT ARRAY(SELECT a FROM unnest($1) g(a) WHERE a IS NOT NULL)
$$ LANGUAGE sql;

Or this function could be in core for general usage. ignore_nulls in array_to_json_pretty is probably not necessary; on the other hand, the cost is zero, and we can have it for API consistency. Regards Pavel Thoughts? Thanks, Stephen
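A quick sanity check of the helper above (a minimal illustration; the expected result is shown as a comment):

SELECT remove_null(ARRAY[1, NULL, 2, NULL, 3]);
-- expected: {1,2,3}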
Re: [HACKERS] [BUGS] BUG #10823: Better REINDEX syntax.
* Vik Fearing (vik.fear...@dalibo.com) wrote: On 09/02/2014 10:17 PM, Marko Tiikkaja wrote: Yeah, I think I like this better than allowing all of them without the database name. Why? It's just a noise word! Eh, because it ends up reindexing the system tables too, which is probably not what new folks are expecting. Also, it's not required when you say 'user tables', so it's similar to your user_tables v1 patch in that regard. Yes, I will update the patch. Still planning to do this? Marking this back to waiting-for-author. Thanks! Stephen
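For readers following along, a sketch of the distinction being discussed (mydb is a placeholder name; the 'user tables' spelling is only the proposal under review, not committed syntax):

-- Today: the database name is required, and system catalogs get reindexed too.
REINDEX DATABASE mydb;

-- REINDEX SYSTEM covers only the system catalogs.
REINDEX SYSTEM mydb;

-- The proposal would add a form that skips system tables; spelling not final:
-- REINDEX USER TABLES;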
[HACKERS] Re: proposal: ignore null fields in non-relation composite-type-based constructors
* Pavel Stehule (pavel.steh...@gmail.com) wrote: ignore_nulls in array_to_json_pretty is probably not necessary. On the other hand, the cost is zero, and we can have it for API consistency. I'm willing to be persuaded either way on this, really. I do think it would be nice for both array_to_json and row_to_json to be single functions which take default values, but as for whether array_to_json should have an ignore_nulls option, I'm on the fence and would defer to people who use that function regularly (I don't today). Beyond that, I'm pretty happy moving forward with this patch. Thanks, Stephen
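For reference when weighing that question: today's array_to_json() keeps NULL elements, which is exactly what an ignore_nulls flag would suppress (a minimal illustration; expected output in the comment):

SELECT array_to_json(ARRAY[1, NULL, 3]);
-- expected: [1,null,3]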
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Sat, Jul 26, 2014 at 9:32 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:

+ pq_mq_busy = true;
+
+ iov[0].data = &msgtype;
+ iov[0].len = 1;
+ iov[1].data = s;
+ iov[1].len = len;
+
+ Assert(pq_mq_handle != NULL);
+ result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
+
+ pq_mq_busy = false;

Don't you need a PG_TRY block here to reset pq_mq_busy? No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more. But since that should only happen if an interrupt arrives while the queue is full, I think that's OK. I think not only an interrupt but any other error in the shm_mq_sendv() path (WaitLatch() is one example) could lead to the same behaviour. (Think about the alternatives: if the queue is full, we have no way of notifying the launching process without waiting for it to retrieve the results, but it might not do that right away, and if we've been killed we need to die *now*, not later.) So in such cases, what is the advice to users? Currently they will see the message below:

postgres=# select * from pg_background_result(5124) as (x int);
ERROR: lost connection to worker process with PID 5124

One way is to ask them to check the logs, but what if they want to handle the error and take some action? Another point about error handling: to execute the SQL in pg_background_worker_main(), it starts a transaction which I think doesn't get aborted if an error occurs, and similarly, handling of timeouts seems to be missing in the error path. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
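To make the user-facing question concrete, here is a hedged sketch of how a caller might trap that failure today. pg_background_launch()/pg_background_result() are names from the patch under review; the surrounding error handling is ordinary PL/pgSQL:

DO $$
DECLARE
    pid    int;
    result int;
BEGIN
    -- Launch a background worker running the given SQL (patch API, assumed).
    pid := pg_background_launch('SELECT 1');

    -- Collect the result; this is where the 'lost connection' error surfaces.
    SELECT r.x INTO result FROM pg_background_result(pid) AS r(x int);
    RAISE NOTICE 'worker returned %', result;
EXCEPTION
    WHEN OTHERS THEN
        -- e.g. ERROR: lost connection to worker process with PID ...
        RAISE NOTICE 'background worker failed: %', SQLERRM;
END;
$$;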