Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape
On Mon, Jul 13, 2015 at 7:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> I wrote:
>> The two contrib modules this patch added are nowhere near fit for public consumption. They cannot clean up after themselves when dropped: ... Raw inserts into system catalogs just aren't a sane thing to do in extensions.
>
> I had some thoughts about how we might fix that, without going to the rather tedious lengths of creating a complete set of DDL infrastructure for CREATE/DROP TABLESAMPLE METHOD.
>
> First off, the extension API designed for the tablesample patch is evidently modeled on the index AM API, which was not a particularly good precedent --- it's not a coincidence that index AMs can't be added or dropped on-the-fly. Modeling a server-internal API as a set of SQL-visible functions is problematic when the call signatures of those functions can't be accurately described by SQL datatypes, and it's rather pointless and inefficient when none of the functions in question is meant to be SQL-callable. It's even less attractive if you don't think you've got a completely stable API spec, because adding, dropping, or changing the signature of any one of the API functions then involves a pile of easy-to-mess-up catalog changes on top of the actually useful work. Not to mention then having to think about backward compatibility of your CREATE command's arguments.
>
> We have a far better model to follow already, namely the foreign data wrapper API. In that, there's a single SQL-visible function that returns a dummy datatype indicating that it's an FDW handler, and when called, it hands back a struct containing pointers to all the other functions that the particular wrapper needs to supply (and, if necessary, the struct could have non-function-pointer fields containing other info the core system might need to know about the handler). We could similarly invent a pseudotype tablesample_handler and represent each tablesample method by a single SQL function returning tablesample_handler.
>
> Any future changes in the API for tablesample handlers would then appear as changes in the C definition of the struct returned by the handler, which requires no SQL-visible thrashing, hence creates no headaches for pg_upgrade, and it is pretty easy to design it so that failure to update an external module that implements the API results in C compiler errors and/or sane fallback behavior.

That's elegant.

> Once we've done that, I think we don't even need a special catalog representing tablesample methods. Given FROM foo TABLESAMPLE bernoulli(...), the parser could just look for a function bernoulli() returning tablesample_handler, and it's done. The querytree would have an ordinary function dependency on that function, which would be sufficient to handle DROP dependency behaviors properly. (On reflection, maybe better if it's bernoulli(internal) returns tablesample_handler, so as to guarantee that such a function doesn't create a conflict with any user-defined function of the same name.)
>
> Thoughts?

Regarding the fact that those two contrib modules can be part of a -contrib package and could be installed, nuking those two extensions from the tree and preventing the creation of custom tablesample methods looks like the correct course of action for 9.5.

When looking at this patch I was as well surprised that the BERNOULLI method can only be applied to a physical relation and cannot fire on a materialized set of tuples, say the result of a WITH clause for example.

> PS: now that I've written this rant, I wonder why we don't redesign the index AM API along the same lines. It probably doesn't matter much at the moment, but if we ever get serious about supporting index AM extensions, I think we ought to consider doing that.

Any API redesign looks to be clearly 9.6 work IMO at this point.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freeze avoidance of very large table.
On Mon, Jul 13, 2015 at 3:39 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
> On Tue, Jul 7, 2015 at 9:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
>> On 6 July 2015 at 17:28, Simon Riggs si...@2ndquadrant.com wrote:
>>> I think we need something for pg_upgrade to rewrite existing VMs. Otherwise a large read-only database would suddenly require a massive revacuum after upgrade, which seems bad. That can wait for now until we all agree this patch is sound.
>>
>> Since we need to rewrite the vm map, I think we should call the new map vfm. That way we will be able to easily check whether the rewrite has been conducted on all relations. Since the maps are just bits there is no other way to tell that a map has been rewritten.
>
> To avoid revacuum after upgrade, you meant that we need to rewrite each bit of vm to the corresponding bits of vfm, if it's from a version not supporting vfm (e.g., 9.5 or earlier), right? If so, we will need to do a whole scan of the table, which is expensive as well. Clearing the vm and doing revacuum would be nicer, rather than doing it during the upgrade, I think.

How will you ensure a revacuum of all the tables after upgrading? Until vacuum is done on the tables that had a vm before the upgrade, any queries on those tables can become slower.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] intarray planning/execution problem with multiple operators
On Sunday 12 July 2015 23:12:41 Jeff Janes wrote:
> I've found an interesting performance problem in the intarray extension module. It doesn't seem to be version dependent; I've verified it in 9.4.4 and 9.6devel.

Hello. Have a look at my patch; maybe it solves this problem.
https://commitfest.postgresql.org/5/253/
--
Uriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] Support for N synchronous standby servers - take 2
On Mon, Jul 13, 2015 at 9:22 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
> I might be missing something, but will these functions generate WAL? If they do, we will face the situation where we need to wait forever, as Fujii-san pointed out.

No, those functions are there to manipulate the metadata defining the quorum/priority set. We definitely do not want something that generates WAL.
--
Michael
Re: [HACKERS] WIP: Enhanced ALTER OPERATOR
Hello hackers. Attached is a new version of patch: * port syntax from NULL to truth NONE * fix documentation (thanks Heikki) * rebase to master Thanks. -- Uriy Zhuravlev Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/doc/src/sgml/ref/alter_operator.sgml b/doc/src/sgml/ref/alter_operator.sgml index bdb2d02..237e4f1 100644 --- a/doc/src/sgml/ref/alter_operator.sgml +++ b/doc/src/sgml/ref/alter_operator.sgml @@ -26,6 +26,11 @@ ALTER OPERATOR replaceablename/replaceable ( { replaceableleft_type/repla ALTER OPERATOR replaceablename/replaceable ( { replaceableleft_type/replaceable | NONE } , { replaceableright_type/replaceable | NONE } ) SET SCHEMA replaceablenew_schema/replaceable + +ALTER OPERATOR replaceablename/replaceable ( { replaceableleft_type/replaceable | NONE } , { replaceableright_type/replaceable | NONE } ) +SET ( { RESTRICT = { replaceable class=parameterres_proc/replaceable | NONE } + | JOIN = { replaceable class=parameterjoin_proc/replaceable | NONE } + } [, ... ] ) /synopsis /refsynopsisdiv @@ -34,8 +39,7 @@ ALTER OPERATOR replaceablename/replaceable ( { replaceableleft_type/repla para commandALTER OPERATOR/command changes the definition of - an operator. The only currently available functionality is to change the - owner of the operator. + an operator. /para para @@ -98,6 +102,25 @@ ALTER OPERATOR replaceablename/replaceable ( { replaceableleft_type/repla /para /listitem /varlistentry + + varlistentry + termreplaceable class=parameterres_proc/replaceable/term + listitem + para + The restriction selectivity estimator function for this operator; write NONE to remove existing selectivity estimator. + /para + /listitem + /varlistentry + + varlistentry + termreplaceable class=parameterjoin_proc/replaceable/term + listitem + para + The join selectivity estimator function for this operator; write NONE to remove existing selectivity estimator. 
+ /para + /listitem + /varlistentry + /variablelist /refsect1 @@ -109,6 +132,13 @@ ALTER OPERATOR replaceablename/replaceable ( { replaceableleft_type/repla programlisting ALTER OPERATOR @@ (text, text) OWNER TO joe; /programlisting/para + + para +Change the restriction and join selectivity estimator function of a custom operator literala b/literal for type typeint[]/type: +programlisting +ALTER OPERATOR (_int4, _int4) SET (RESTRICT = _int_contsel, JOIN = _int_contjoinsel); +/programlisting/para + /refsect1 refsect1 diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c index 072f530..4c5c9c6 100644 --- a/src/backend/catalog/pg_operator.c +++ b/src/backend/catalog/pg_operator.c @@ -28,8 +28,10 @@ #include catalog/pg_operator.h #include catalog/pg_proc.h #include catalog/pg_type.h +#include commands/defrem.h #include miscadmin.h #include parser/parse_oper.h +#include parser/parse_func.h #include utils/acl.h #include utils/builtins.h #include utils/lsyscache.h @@ -53,7 +55,7 @@ static Oid OperatorShellMake(const char *operatorName, Oid leftTypeId, Oid rightTypeId); -static void OperatorUpd(Oid baseId, Oid commId, Oid negId); +static void ShellOperatorUpd(Oid baseId, Oid commId, Oid negId); static Oid get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId, @@ -563,7 +565,7 @@ OperatorCreate(const char *operatorName, commutatorId = operatorObjectId; if (OidIsValid(commutatorId) || OidIsValid(negatorId)) - OperatorUpd(operatorObjectId, commutatorId, negatorId); + ShellOperatorUpd(operatorObjectId, commutatorId, negatorId); return address; } @@ -633,7 +635,7 @@ get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId, } /* - * OperatorUpd + * ShellOperatorUpd * * For a given operator, look up its negator and commutator operators. 
* If they are defined, but their negator and commutator fields @@ -642,7 +644,7 @@ get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId, * which are the negator or commutator of each other. */ static void -OperatorUpd(Oid baseId, Oid commId, Oid negId) +ShellOperatorUpd(Oid baseId, Oid commId, Oid negId) { int i; Relation pg_operator_desc; @@ -864,3 +866,154 @@ makeOperatorDependencies(HeapTuple tuple) return myself; } + +/* + * Operator update aka ALTER OPERATOR for RESTRICT, JOIN + */ +void OperatorUpd(Oid classId, + Oid baseId, + List *operator_params) +{ + int i; + ListCell *pl; + Relation catalog; + HeapTuple tup; + Oid operator_param_id = 0; + bool nulls[Natts_pg_operator]; + bool replaces[Natts_pg_operator]; + Datum values[Natts_pg_operator]; + + for (i = 0; i Natts_pg_operator; ++i) + { + values[i] = (Datum) 0; + replaces[i] = false; + nulls[i] = false; + } + + catalog = heap_open(classId, RowExclusiveLock); + tup =
Re: [HACKERS] multivariate statistics / patch v7
Hi,

Thanks for the detailed explanation. I misunderstood the code (more honestly speaking, I didn't look at it so closely). Then I looked at it more closely.

At Wed, 08 Jul 2015 03:03:16 +0200, Tomas Vondra tomas.von...@2ndquadrant.com wrote in 559c76d4.2030...@2ndquadrant.com
> FWIW this was a stupid bug in update_match_bitmap_histogram(), which initially handled only AND clauses, and thus assumed the match of a bucket can only decrease. But for OR clauses this is exactly the opposite (we assume no buckets match and add buckets matching at least one of the clauses). With this fixed, the estimates look like this: IMHO pretty accurate estimates - no issue with OR clauses.

Ok, I understood the difference between what I thought and what you say. The code is actually conscious of OR clauses but looks somewhat confused.

Currently, choosing mv stats in clauselist_selectivity can be outlined as follows:

1. find_stats finds candidate mv stats containing *all* attributes appearing in the whole clause, regardless of and/or exprs, by walking the whole clause tree. Perhaps this is a measure for early bailout.

2.1. Within every disjunction element, collect mv-related attributes while checking whether all the leaf nodes (binop or ifnull) are compatible, by (eventually) walking the whole clause tree.

2.2. Check whether all the collected attributes are contained in the mv-stats columns.

3. Finally, clauseset_mv_selectivity_histogram() (and others). This function applies every ExprOp to every attribute in every histogram bucket and (tries to) make the boolean operation of the resulting bitmaps.

I have some comments on the implementation, and I also tried to find solutions for them.

1. The flow above looks to be doing very similar things repeatedly.
2. I believe what the current code does can be simplified.
3. As you mentioned in comments, some additional infrastructure is needed.

After all, I think what we should do next is as follows, as the first step.

- Add a means to judge the selectivity operator(?) by something other than the oprrest of the op of the ExprOp. (You missed neqsel already.) I suppose one solution for this is adding oprmvstats taking 'm', 'h' and 'f' and their combinations. Or, for convenience, it could be a fixed-length string like this:

oprname | oprmvstats
=       | 'mhf'
<>      | 'mhf'
<       | 'mh-'
>       | 'mh-'
<=      | 'mh-'
>=      | 'mh-'

This would make the code in clause_is_mv_compatible something like this:

oprmvstats = get_mvstatsset(expr->opno); /* bitwise representation */
if (oprmvstats & types)
{
    *attnums = bms_add_member(*attnums, var->varattno);
    return true;
}
return false;

- The current design just manages to work, but it is too complicated and hardly has affinity with the existing estimation framework. I previously proposed separating the stats-finding phase from the calculation phase, but after looking closely at the patch I would instead propose a phase that transforms RestrictInfo (and finds mvstats) and a phase that runs on the transformed RestrictInfo. I think transforming RestrictInfo makes the situation better. Since it needs different information, maybe it is better to have a new struct, say, RestrictInfoForEstimate (boo!). Then provide mvstatssel() for use with the new struct. A rough sketch of the code would be like below:

clauselist_selectivity()
{
    ...
    RestrictInfoForEstimate *esclause =
        transformClauseListForEstimation(root, clauses, varRelid);
    ...
    return clause_selectivity(esclause);
}

clause_selectivity(RestrictInfoForEstimate *esclause)
{
    if (IsA(clause, RestrictInfo)) ...
    if (IsA(clause, RestrictInfoForEstimate))
    {
        RestrictInfoForEstimate *ecl = (RestrictInfoForEstimate *) clause;
        if (ecl->selfunc)
        {
            sx = ecl->selfunc(root, ecl);
        }
    }
    if (IsA(clause, Var)) ...
}

transformClauseListForEstimation(...)
{
    ...
    relid = collect_mvstats_info(root, clause, attlist);
    if (!relid)
        return;
    if (get_mvstats_hook)
        mvstats = (*get_mvstats_hook) (root, relid, attset);
    else
        mvstats = find_mv_stats(root, relid, attset);
}
...

> I've pushed this to github [1] but I need to do some additional fixes. I also had to remove some optimizations while fixing this, and will have to reimplement those. That's not to say that the handling of OR-clauses is perfectly correct. After looking at clauselist_selectivity_or(), I believe it's a bit broken and will need a bunch of fixes, as explained in the FIXMEs I pushed to github.
>
> [1] https://github.com/tvondra/postgres/tree/mvstats

I don't yet see whether it is doable or not, and I suppose you're unwilling to change the big picture, so I will consider the idea and show you the result, if it turns out to be possible and promising.

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On Mon, Jul 13, 2015 at 3:26 PM, Ildus Kurbangaliev i.kurbangal...@postgrespro.ru wrote:
> On 07/12/2015 06:53 AM, Amit Kapila wrote:
>> For having duration, I think you need to use gettimeofday or some similar call to calculate the wait time; it will be okay for the cases where the wait time is longer, however it could be problematic for the cases where the waits are very small (which could probably be the case for LWLocks).
>
> gettimeofday is already used in our patch and it gives enough accuracy (in microseconds), especially when lwlocks become a problem. Also we tested our implementation and it gives an overhead of less than 1%. (http://www.postgresql.org/message-id/559d4729.9080...@postgrespro.ru, testing part).

I think that test is quite generic; we should test more combinations (like using the -M prepared option, as that can stress the LWLock machinery somewhat more) and other types of tests which can stress the part of the code where gettimeofday() is used in the patch.

> We need help here with testing on other platforms. I used gettimeofday because of the built-in module instr_time.h that already gives cross-platform tested functions for measuring, but I'm planning to make a similar implementation for monotonic functions based on clock_gettime for more accuracy.
>
>> 2) Accumulate per backend statistics about each wait event type: number of occurrences and total duration. With this statistics user can identify system bottlenecks again without sampling.
>
> Number #2 will be provided as a separate patch. Number #1 requires a different concurrency model. ldus will extract it from the waits monitoring patch shortly.

Sure, I think those should be evaluated as separate patches, and I can look into those patches and see if something more can be exposed as part of this patch which can be reused in those patches.

> If you agree I'll do some modifications to your patch, so we can later extend it with our other modifications. The main issue is that one variable for all types is not enough. For flexibility in the future we need at least two - class and event, for example class=LWLock, event=ProcArrayLock, or class=Storage, event=READ.

I have already proposed something very similar in this thread [1] (where instead of class, I have used wait_event_type), to which Robert doesn't agree, so here I think that before writing code, it seems prudent to get an agreement about what kind of user interface would satisfy the requirement and be extensible for the future as well. I think it will be better if you can highlight some points about what kind of user interface is better (extensible) and the reasons for same.

[1] (Refer option-3) - http://www.postgresql.org/message-id/caa4ek1j6cg_jym00nrwt4n8r78zn4ljoqy_zu1xrzxfq+me...@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Freeze avoidance of very large table.
On Mon, Jul 13, 2015 at 9:03 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
> On Mon, Jul 13, 2015 at 7:46 PM, Amit Kapila amit.kapil...@gmail.com wrote:
>> On Mon, Jul 13, 2015 at 3:39 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
>>> On Tue, Jul 7, 2015 at 9:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>>> On 6 July 2015 at 17:28, Simon Riggs si...@2ndquadrant.com wrote:
>>>>> I think we need something for pg_upgrade to rewrite existing VMs. Otherwise a large read-only database would suddenly require a massive revacuum after upgrade, which seems bad. That can wait for now until we all agree this patch is sound.
>>>>
>>>> Since we need to rewrite the vm map, I think we should call the new map vfm. That way we will be able to easily check whether the rewrite has been conducted on all relations. Since the maps are just bits there is no other way to tell that a map has been rewritten.
>>>
>>> To avoid revacuum after upgrade, you meant that we need to rewrite each bit of vm to the corresponding bits of vfm, if it's from a version not supporting vfm (e.g., 9.5 or earlier), right? If so, we will need to do a whole scan of the table, which is expensive as well. Clearing the vm and doing revacuum would be nicer, rather than doing it during the upgrade, I think.
>>
>> How will you ensure a revacuum of all the tables after upgrading?
>
> We use a script file which is generated by pg_upgrade.

I haven't followed this thread closely, but I am sure you recall that vacuumdb has a parallel mode.
--
Michael
Re: [HACKERS] Freeze avoidance of very large table.
On Tue, Jul 7, 2015 at 9:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
> On 6 July 2015 at 17:28, Simon Riggs si...@2ndquadrant.com wrote:
>> I think we need something for pg_upgrade to rewrite existing VMs. Otherwise a large read-only database would suddenly require a massive revacuum after upgrade, which seems bad. That can wait for now until we all agree this patch is sound.
>
> Since we need to rewrite the vm map, I think we should call the new map vfm. That way we will be able to easily check whether the rewrite has been conducted on all relations. Since the maps are just bits there is no other way to tell that a map has been rewritten.

To avoid revacuum after upgrade, you meant that we need to rewrite each bit of vm to the corresponding bits of vfm, if it's from a version not supporting vfm (e.g., 9.5 or earlier), right? If so, we will need to do a whole scan of the table, which is expensive as well. Clearing the vm and doing revacuum would be nicer, rather than doing it during the upgrade, I think.

Regards,
--
Masahiko Sawada
Re: [HACKERS] Support for N synchronous standby servers - take 2
On Fri, Jul 10, 2015 at 10:06 PM, Beena Emerson memissemer...@gmail.com wrote:
> Hello,
>
> On Tue, Jul 7, 2015 at 02:56 AM, Josh Berkus wrote:
>> pro-JSON:
>> * standard syntax which is recognizable to sysadmins and devops.
>> * can use JSON/JSONB functions with ALTER SYSTEM SET to easily make additions/deletions from the synch rep config.
>> * can add group labels (see below)
>
> Adding group labels does have a lot of value, but as Amit has pointed out, with little modification they can be included in the GUC as well. It will not make it any more complex.
>
> On Tue, Jul 7, 2015 at 2:19 PM, Michael Paquier wrote:
>> Something like pg_syncinfo/ coupled with an LWLock; we already do something similar for replication slots with pg_replslot/.
>
> I was trying to figure out how the JSON metadata can be used. It would have to be set using a given set of functions, right? I am sorry, this question is very basic. The functions could be something like:
>
> 1. pg_add_synch_set(set_name NAME, quorum INT, is_priority bool, set_members VARIADIC)
> This will be used to add a sync set. The set_members can be individual elements or another set name. The parameter is_priority is used to decide whether the set is a priority (true) set or a quorum (false) set. This function call will create a folder pg_syncinfo/groups/$NAME and store the JSON blob? The root group would be set automatically by finding the group which is not included in other groups? Or can it be set by another function?
>
> 2. pg_modify_sync_set(set_name NAME, quorum INT, is_priority bool, set_members VARIADIC)
> This will update pg_syncinfo/groups/$NAME to store the new values.
>
> 3. pg_drop_synch_set(set_name NAME)
> This will remove the pg_syncinfo/groups/$NAME folder. Also, all the groups which included this one would be updated?
>
> 4. pg_show_synch_set()
> This will display the current sync setting in JSON format.
>
> Am I missing something? Is JSON being preferred because it would be ALTER SYSTEM friendly and in a format already known to users? In a real-life scenario, at most how many groups and nestings would be expected?

I might be missing something, but will these functions generate WAL? If they do, we will face the situation where we need to wait forever, as Fujii-san pointed out.

Regards,
--
Masahiko Sawada
Re: [HACKERS] Freeze avoidance of very large table.
On 2015-07-13 21:03:07 +0900, Sawada Masahiko wrote:
> Even if we implement a rewriting tool for the vm in pg_upgrade, it will take as much time as a revacuum because it needs a whole scan of the table.

Why would it? Sure, you can only set allvisible and not the frozen bit, but that's fine. That way the cost for freezing can be paid over time. If we require terabytes of data to be scanned, including possibly rewriting large portions due to freezing, before index-only scans work and most vacuums act in a partial manner, the migration to 9.6 will be a major pain for our users.
Re: [HACKERS] Freeze avoidance of very large table.
On Mon, Jul 13, 2015 at 7:46 PM, Amit Kapila amit.kapil...@gmail.com wrote:
> On Mon, Jul 13, 2015 at 3:39 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
>> On Tue, Jul 7, 2015 at 9:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>> On 6 July 2015 at 17:28, Simon Riggs si...@2ndquadrant.com wrote:
>>>> I think we need something for pg_upgrade to rewrite existing VMs. Otherwise a large read-only database would suddenly require a massive revacuum after upgrade, which seems bad. That can wait for now until we all agree this patch is sound.
>>>
>>> Since we need to rewrite the vm map, I think we should call the new map vfm. That way we will be able to easily check whether the rewrite has been conducted on all relations. Since the maps are just bits there is no other way to tell that a map has been rewritten.
>>
>> To avoid revacuum after upgrade, you meant that we need to rewrite each bit of vm to the corresponding bits of vfm, if it's from a version not supporting vfm (e.g., 9.5 or earlier), right? If so, we will need to do a whole scan of the table, which is expensive as well. Clearing the vm and doing revacuum would be nicer, rather than doing it during the upgrade, I think.
>
> How will you ensure a revacuum of all the tables after upgrading?

We use a script file which is generated by pg_upgrade.

> Until vacuum is done on the tables that had a vm before the upgrade, any queries on those tables can become slower.

Even if we implement a rewriting tool for the vm in pg_upgrade, it will take as much time as a revacuum because it needs a whole scan of the table. I meant that we rewrite the vm using the existing facility (i.e., vacuum (freeze)), instead of implementing a new rewriting tool for the vm.

Regards,
--
Masahiko Sawada
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On 07/13/2015 01:36 PM, Amit Kapila wrote:
> I have already proposed something very similar in this thread [1] (where instead of class, I have used wait_event_type) to which Robert doesn't agree, so here I think before writing code, it seems prudent to get an agreement about what kind of User-Interface would satisfy the requirement and will be extendible for future as well. I think it will be better if you can highlight some points about what kind of user-interface is better (extendible) and the reasons for same.
>
> [1] (Refer option-3) - http://www.postgresql.org/message-id/caa4ek1j6cg_jym00nrwt4n8r78zn4ljoqy_zu1xrzxfq+me...@mail.gmail.com

The idea of splitting into classes and events does not conflict with your current implementation. It is not an issue to show only one value in pg_stat_activity and the more detailed two parameters in other places. The basic reason is that a DBA will want to see grouped information about a class (for example, the wait time of the whole `Storage` class).

As for the user interface, it depends on what we want to be monitored. In our patch we have profiling and history. In profiling we show class, event, wait_time and count. In history we save all parameters of a wait.

Another problem with pg_stat_activity is that we cannot see all processes there (the checkpointer, for example). So we need a separate view for monitoring purposes anyway.
--
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On Mon, Jul 13, 2015 at 06:19:49PM -0400, Andrew Dunstan wrote:
> On 7/13/2015 5:36 PM, Andrew Dunstan wrote:
>> hstore_plpython.o: In function `hstore_to_plpython':
>> /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:35: undefined reference to `PLyUnicode_FromStringAndSize'
>
> The functions are in fact apparently built - the names are present in the object file and the DLL.

Per objdump -x src/pl/plpython/plpython3.dll | less -pOrdinal/Name, those symbols aren't exported. The Cygwin toolchain appears to export every function until you mark one with __declspec (dllexport), after which it exports just the ones so marked. Since plpython3.dll has an explicit __declspec on PyInit_plpy(), that's the sole function exported. Adding -Wl,--export-all-symbols to the PL/Python link lets the build complete; I have not researched whether it is a proper fix.
Re: [HACKERS] Forensic recovery deleted pgdump custom format file
On Tue, Jul 14, 2015 at 11:20 AM, David Guimaraes skys...@gmail.com wrote:
> Yeah bingo

Hm. While there is a magic-code header for the custom format, looking at the code I do not see any trace of a similar marker at the end of the dump file (_CloseArchive in pg_backup_custom.c), and I don't recall either that there is an estimation of the size of the dump in the header. If those files were stored close to each other, one idea may be to look for the next header present, or to attempt to roughly estimate the size that they would have, I am afraid. In any case, applying reverse-engineering methods seems like the most reliable way to reconstitute an archive that could be handled by pg_restore or pg_dump, but perhaps others have other ideas.
--
Michael
Re: [HACKERS] [PATCH] Generalized JSON output functions
On 07/13/2015 05:41 AM, Shulgin, Oleksandr wrote: On Mon, Jul 13, 2015 at 9:44 AM, Pavel Stehule pavel.steh...@gmail.com mailto:pavel.steh...@gmail.com wrote: To reiterate: for my problem, that is escaping numerics that can potentially overflow[1] under ECMAScript standard, I want to be able to override the code that outputs the numeric converted to string. There is no way in current implementation to do that *at all*, short of copying all the code involved in producing JSON output and changing it at certain points. One could try re-parsing JSON instead, but that doesn't actually solve the issue, because type information is lost forever at that point. The whitespace unification was a mere side-effect of the original effort on this patch. The dynamic type change is some what I would not to do in database, really :) If you afraid about overflow, then convert numeric to string immediately - in this case, the client have to support both variant - so immediate cast should not be a problem. Yeah, but how would you do that in context of a logical replication decoding plugin? I've tried a number of tricks for that, including, but not limited to registering phony types to wrap numeric type and replacing the OID of numeric with this custom type OID in TupleDesc, but then again one has to register that as known record type, etc. Anyway this check on max number should be implemented in our JSON(b) out functions (as warning?). Not really, since this is a problem of ECMAScript standard, not JSON spec. For example, Python module for handling JSON doesn't suffer from this overflow problem, The thing is, we cannot know which clients are going to consume the stream of decoded events, and if it's some implementation of JavaScript, it can suffer silent data corruption if we don't guard against that in the logical decoding plugin. Hope that makes it clear. :-) Yes, but I think the plugin is the right place to do it. 
What is more, this won't actually prevent you completely from producing non-ECMAScript-compliant JSON, since json or jsonb values containing offending numerics won't be caught, AIUI. But a fairly simple-to-write function that reparsed and fixed the JSON inside the decoder would work. cheers andrew
[HACKERS] dead assignment src/bin/scripts/print.c line 421
Friendly greetings! In file src/bin/scripts/print.c, line 421: need_recordsep = false; is then set to true at line 424. Now I'm pretty sure it's a meaningless bug without any consequence (the commit that introduced it is 15 years old). There are a lot of (apparently) dead assignments here and there, but some assignments could be used for debugging purposes, so... why not. But this one? -- Laurent ker2x Laborde DBA gandi.net \o/
Re: [HACKERS] Freeze avoidance of very large table.
On Mon, Jul 13, 2015 at 9:22 PM, Andres Freund and...@anarazel.de wrote: On 2015-07-13 21:03:07 +0900, Sawada Masahiko wrote: Even if we implement a rewriting tool for the vm in pg_upgrade, it will take as much time as a re-vacuum because it needs to scan the whole table. Why would it? Sure, you can only set allvisible and not the frozen bit, but that's fine. That way the cost of freezing can be paid over time. If we require terabytes of data to be scanned, including possibly rewriting large portions due to freezing, before index only scans work and most vacuums act in a partial manner, the migration to 9.6 will be a major pain for our users. Ah, if we set all bits as not all-frozen, we don't need to scan the whole table, only the vm. And I agree with this. But please imagine the case where the old cluster has a table which is very large, read-only, and vacuum freeze is done. In this case, the all-frozen bits of such a table in the new cluster will not be set unless we do vacuum freeze again. The all-frozen information for such a table is lost. Regards, -- Masahiko Sawada
Re: [HACKERS] [PATCH] Generalized JSON output functions
On 07/13/2015 10:46 AM, Ryan Pedela wrote: On Mon, Jul 13, 2015 at 1:30 AM, Shulgin, Oleksandr oleksandr.shul...@zalando.de mailto:oleksandr.shul...@zalando.de wrote: To reiterate: for my problem, that is escaping numerics that can potentially overflow[1] under ECMAScript standard, I want to be able to override the code that outputs the numeric converted to string. There is no way in current implementation to do that *at all*, short of copying all the code involved in producing JSON output and changing it at certain points. One could try re-parsing JSON instead, but that doesn't actually solve the issue, because type information is lost forever at that point. I had the exact same problem with Node.js and client-side Javascript. That is why I wrote json-bignum [1] for Node.js. There is a bower version [2] as well. The only caveat is that it is slower than the native JSON functions, but I am happy to receive PRs to improve performance. 1. https://github.com/datalanche/json-bignum 2. https://libraries.io/bower/json-bignum As far as large numbers in JSON, I think Postgres is doing the right thing and should not be changed. It is Javascript that is stupid here, and I don't think it is wise to add something to core just because one client does stupid things with large numbers. In addition, ES7 is introducing value types which will hopefully solve the large number problem in Javascript. The random whitespace issue is valid in my opinion and should be fixed. OK, I think we're getting a consensus here. It's good to know the JS world is acquiring some sanity in this area. Let's just fix the whitespace and be done, without all the callback stuff. That should be a much smaller patch. cheers andrew
Re: [HACKERS] Support for N synchronous standby servers - take 2
On Fri, Jul 10, 2015 at 10:06 PM, Beena Emerson memissemer...@gmail.com wrote: Hello, On Tue, Jul 7, 2015 at 02:56 AM, Josh Berkus wrote: pro-JSON: * standard syntax which is recognizable to sysadmins and devops. * can use JSON/JSONB functions with ALTER SYSTEM SET to easily make additions/deletions from the synch rep config. * can add group labels (see below) Adding group labels does have a lot of value, but as Amit has pointed out, with a little modification they can be included in the GUC as well. Or you can extend the custom GUC mechanism so that we can specify the groups by using them, for example, quorum_commit.mygroup1 = 'london, nyc' quorum_commit.mygroup2 = 'tokyo, pune' synchronous_standby_names = '1(mygroup1), 1(mygroup2)' On Tue, Jul 7, 2015 at 2:19 PM, Michael Paquier wrote: Something like pg_syncinfo/ coupled with a LW lock; we already do something similar for replication slots with pg_replslot/. I was trying to figure out how the JSON metadata can be used. It would have to be set using a given set of functions. So we can use only such a set of functions to configure synch rep? I don't like that idea, because it prevents us from configuring that while the server is not running. Is JSON being preferred because it would be ALTER SYSTEM friendly and in a format already known to users? At least currently ALTER SYSTEM cannot accept JSON data (e.g., the return value of a JSON function like json_build_object()) as the setting value. So I'm not sure how friendly ALTER SYSTEM and the JSON format really are. If you want to argue that, you probably need to improve ALTER SYSTEM so that JSON can be specified. In a real-life scenario, at most how many groups and nestings would be expected? I don't think that many groups and nestings are common. Regards, -- Fujii Masao
Re: [HACKERS] [PATCH] Generalized JSON output functions
On Mon, Jul 13, 2015 at 1:30 AM, Shulgin, Oleksandr oleksandr.shul...@zalando.de wrote: To reiterate: for my problem, that is escaping numerics that can potentially overflow[1] under ECMAScript standard, I want to be able to override the code that outputs the numeric converted to string. There is no way in current implementation to do that *at all*, short of copying all the code involved in producing JSON output and changing it at certain points. One could try re-parsing JSON instead, but that doesn't actually solve the issue, because type information is lost forever at that point. I had the exact same problem with Node.js and client-side Javascript. That is why I wrote json-bignum [1] for Node.js. There is a bower version [2] as well. The only caveat is that it is slower than the native JSON functions, but I am happy to receive PRs to improve performance. 1. https://github.com/datalanche/json-bignum 2. https://libraries.io/bower/json-bignum As far as large numbers in JSON, I think Postgres is doing the right thing and should not be changed. It is Javascript that is stupid here, and I don't think it is wise to add something to core just because one client does stupid things with large numbers. In addition, ES7 is introducing value types which will hopefully solve the large number problem in Javascript. The random whitespace issue is valid in my opinion and should be fixed. Thanks, Ryan Pedela
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On Mon, Jul 13, 2015 at 9:19 PM, Ildus Kurbangaliev i.kurbangal...@postgrespro.ru wrote: On 07/13/2015 01:36 PM, Amit Kapila wrote: Another problem of pg_stat_activity is that we cannot see all processes there (checkpointer for example). So we need a separate view for monitoring purposes anyway. +1 When there are many walsender processes running, maybe I'd like to see their wait events. Regards, -- Fujii Masao
Re: [HACKERS] Default Roles (was: Additional role attributes)
On Wed, May 13, 2015 at 12:07 PM, Stephen Frost sfr...@snowman.net wrote: All, This patch gets smaller and smaller. Upon reflection I realized that, with default roles, it's entirely unnecessary to change how the permission checks happen today; we can simply add checks to them to look at role membership also. That's removed the last of my concerns regarding any API breakage for existing use-cases and has greatly simplified things overall. This does change the XLOG functions to require pg_monitor, as discussed on the other thread where it was pointed out by Heikki that the XLOG location information could be used to extract sensitive information based on what happens during compression. Adding docs explaining that is on my to-do list for tomorrow. * Stephen Frost (sfr...@snowman.net) wrote: Andres suggested that we drop the REPLICATION role attribute and just use membership in pg_replication instead. That's turned out quite fantastically as we can now handle upgrades without breaking anything: CREATE ROLE and ALTER ROLE still accept the attribute but simply grant pg_replication to the role instead, and postinit.c has been changed to check role membership similar to other pg_hba role membership checks when a replication connection comes in. Hats off to Andres for his suggestion. It's also unnecessary to change how the REPLICATION role attribute functions today. This patch does add the pg_replication role, but it's only allowed to execute the various pg_logical and friends functions and not to actually connect as a REPLICATION user. Connecting as a REPLICATION user allows you to stream the entire contents of the backend, after all, so it makes sense to have that be independent. I added another default role which allows the user to view pg_show_file_settings, as that seemed useful to me. The diffstat for that being something like 4 additions without docs and maybe 10 with. More documentation would probably be good though and I'll look at adding to it.
Most of the rest of what I've done has simply been reverting back to what we had. The patch is certainly far easier to verify by reading through it now, as the changes are right next to each other, and the regression output changes are much smaller. Thoughts? Comments? Suggestions? The documentation of the functions for which corresponding default roles are added by this patch needs to be updated. For example, the description of pg_xlog_replay_pause() says "Pauses recovery immediately (restricted to superusers)." I think that the description needs to mention the corresponding default role pg_replay. Otherwise, it's difficult for users to see which default role is related to the function they want to use. Or probably we can add a table explaining all the relationships between default roles and corresponding operations; that would be useful. Why do we allow users to change the attributes of default roles? For example, ALTER ROLE default_role or GRANT ... TO default_role. Those changes are not dumped by pg_dumpall. So if users change the attributes for some reason but they disappear via pg_dumpall, the system may go into an unexpected state. I think that it's better to allow the roles with pg_monitor to execute pgstattuple functions. They are usually used for monitoring. Thoughts? Regards, -- Fujii Masao
Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape
Michael Paquier michael.paqu...@gmail.com writes: Regarding the fact that those two contrib modules can be part of a -contrib package and could be installed, nuking those two extensions from the tree and preventing the creation of custom tablesample methods looks like a correct course of things to do for 9.5. TBH, I think the right thing to do at this point is to revert the entire patch and send it back for ground-up rework. I think the high-level design is wrong in many ways and I have about zero confidence in most of the code details as well. I'll send a separate message about high-level issues, but as far as code details go, I started to do some detailed code review last night and only got through contrib/tsm_system_rows/tsm_system_rows.c before deciding it was hopeless. Let's have a look at my notes: * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group Obsolete copyright date. * IDENTIFICATION * contrib/tsm_system_rows_rowlimit/tsm_system_rows.c Wrong filename. (For the moment, I'll refrain from any value judgements about the overall adequacy or quality of the comments in this patch, and just point out obvious errors that should have been caught in review.) typedef struct { SamplerRandomState randstate; uint32 seed; /* random seed */ BlockNumber nblocks; /* number of block in relation */ int32 ntuples; /* number of tuples to return */ int32 donetuples; /* tuples already returned */ OffsetNumber lt; /* last tuple returned from current block */ BlockNumber step; /* step size */ BlockNumber lb; /* last block visited */ BlockNumber doneblocks; /* number of already returned blocks */ } SystemSamplerData; This same typedef name is defined in three different places in the patch (tablesample/system.c, tsm_system_rows.c, tsm_system_time.c). While that might not amount to a bug, it's sure a recipe for confusion, especially since the struct definitions are all different.
Datum tsm_system_rows_init(PG_FUNCTION_ARGS) { TableSampleDesc *tsdesc = (TableSampleDesc *) PG_GETARG_POINTER(0); uint32 seed = PG_GETARG_UINT32(1); int32 ntuples = PG_ARGISNULL(2) ? -1 : PG_GETARG_INT32(2); This is rather curious coding. Why should this function check only one of its arguments for nullness? If it needs to defend against any of them being null, it really needs to check all. But in point of fact, this function is declared STRICT, which means it's a violation of the function call protocol if the core code ever passes a null to it, and so this test ought to be dead code. A similar pattern of ARGISNULL checks in declared-strict functions exists in all the tablesample modules, not just this one, showing that this is an overall design error not just a thinko here. My inclination would be to make the core code enforce non-nullness of all tablesample arguments so as to make it unnecessary to check strictness of the tsm*init functions, but in any case it is Not Okay to just pass nulls willy-nilly to strict C functions. Also, I find this coding pretty sloppy even without the strictness angle, because the net effect of this way of dealing with nulls is that an argument-must-not-be-null complaint is reported as "out of range", which is both confusing and the wrong ERRCODE. if (ntuples < 1) ereport(ERROR, (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("invalid sample size"), errhint("Sample size must be positive integer value."))); I don't find this to be good error message style. The secondary comment is not a hint, it's an ironclad statement of what you did wrong, so if we wanted to phrase it like this it should be an errdetail not errhint.
But the whole thing is overly cute anyway because there is no reason at all not to just say what we mean in the primary error message, e.g. ereport(ERROR, (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("sample size must be greater than zero"))); While we're on the subject, what's the reason for disallowing a sample size of zero? Seems like a reasonable edge case. /* All blocks have been read, we're done */ if (sampler->doneblocks > sampler->nblocks || sampler->donetuples >= sampler->ntuples) PG_RETURN_UINT32(InvalidBlockNumber); Okay, I lied, I *am* going to complain about this comment. Comments that do not accurately describe the code they're attached to are worse than useless. /* * Cleanup method. */ Datum tsm_system_rows_end(PG_FUNCTION_ARGS) { TableSampleDesc *tsdesc = (TableSampleDesc *) PG_GETARG_POINTER(0); pfree(tsdesc->tsmdata); PG_RETURN_VOID(); } This cleanup method is a waste of code space. There is no need to pfree individual allocations at query execution end. limitnode = linitial(args); limitnode = estimate_expression_value(root, limitnode);
Re: [HACKERS] Freeze avoidance of very large table.
On 13 July 2015 at 15:48, Sawada Masahiko sawada.m...@gmail.com wrote: On Mon, Jul 13, 2015 at 9:22 PM, Andres Freund and...@anarazel.de wrote: On 2015-07-13 21:03:07 +0900, Sawada Masahiko wrote: Even if we implement a rewriting tool for the vm in pg_upgrade, it will take as much time as a re-vacuum because it needs to scan the whole table. Why would it? Sure, you can only set allvisible and not the frozen bit, but that's fine. That way the cost of freezing can be paid over time. If we require terabytes of data to be scanned, including possibly rewriting large portions due to freezing, before index only scans work and most vacuums act in a partial manner, the migration to 9.6 will be a major pain for our users. Ah, if we set all bits as not all-frozen, we don't need to scan the whole table, only the vm. And I agree with this. But please imagine the case where the old cluster has a table which is very large, read-only, and vacuum freeze is done. In this case, the all-frozen bits of such a table in the new cluster will not be set unless we do vacuum freeze again. The all-frozen information for such a table is lost. The contents of the VM fork are essential to retain after an upgrade because they are used for Index Only Scans. If we destroy that information it could send SQL response times to unacceptable levels after upgrade. It takes time to scan the VM and create the new VFM, but the time taken is proportional to the size of the VM, which seems like it will be acceptable. Example calcs: an 8TB PostgreSQL installation would need us to scan 128MB of VM into about 256MB of VFM. Probably the fsyncs will occupy the most time. In comparison, we would need to scan all 8TB to rebuild the VMs, which would take much longer (and fsyncs would still be needed). Since we don't record freeze map information now, it is acceptable to begin after upgrade with all freeze info set to zero.
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
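Simon's estimate follows directly from the on-disk layout: with the default 8 kB block size there is one visibility-map bit per heap page today, and two bits per page once a frozen bit is added. A quick sketch of the arithmetic (hypothetical helper, assuming the default block size):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of visibility-map data for a relation of 'relbytes' bytes,
 * assuming the default 8 kB block size and 'bits' map bits per heap
 * page (1 today, 2 with the proposed frozen bit). */
static uint64_t
vm_size_bytes(uint64_t relbytes, int bits)
{
    uint64_t    pages = relbytes / 8192;    /* heap pages in relation */

    return pages * bits / 8;                /* 8 map bits per byte */
}
```

So rescanning the map instead of the heap is a factor of roughly 64K less I/O for the 8TB example.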
Re: [HACKERS] Forensic recovery deleted pgdump custom format file
The backups were deleted. I need them to use pg_restore. On 13/07/2015 21:18, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Jul 14, 2015 at 9:28 AM, David Guimaraes skys...@gmail.com wrote: So I decided to try to understand the file format generated by pg_dump. Analyzing the source code of pg_dump/pg_restore, I concluded a few things: The header of the file always starts with PGDMP followed by the pg_dump version number used, followed by int size, offset, etc., followed by TOCs. My question is: how do I know the end of the file? Is there any signature that I can use? Or would I have to analyze the whole file? Why are you trying to reinvent the wheel? pg_restore is not available? -- Michael
Re: [HACKERS] Forensic recovery deleted pgdump custom format file
On Tue, Jul 14, 2015 at 10:58 AM, David Guimaraes skys...@gmail.com wrote: The backups were deleted. I need them to use pg_restore. So what you mean is that you are looking at your disk at the FS level to find traces of those deleted backups by analyzing their binary format... Am I missing something? -- Michael
Re: [HACKERS] Forensic recovery deleted pgdump custom format file
Yeah bingo On 13/07/2015 22:11, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Jul 14, 2015 at 10:58 AM, David Guimaraes skys...@gmail.com wrote: The backups were deleted. I need them to use pg_restore. So what you mean is that you are looking at your disk at the FS level to find traces of those deleted backups by analyzing their binary format... Am I missing something? -- Michael
Re: [HACKERS] Default Roles (was: Additional role attributes)
Fujii, * Fujii Masao (masao.fu...@gmail.com) wrote: The documentation of the functions for which corresponding default roles are added by this patch needs to be updated. For example, the description of pg_xlog_replay_pause() says "Pauses recovery immediately (restricted to superusers)." I think that the description needs to mention the corresponding default role pg_replay. Otherwise, it's difficult for users to see which default role is related to the function they want to use. Or probably we can add a table explaining all the relationships between default roles and corresponding operations; that would be useful. Certainly, totally agree that we need to make it clear in the function descriptions also. Why do we allow users to change the attributes of default roles? For example, ALTER ROLE default_role or GRANT ... TO default_role. Those changes are not dumped by pg_dumpall. So if users change the attributes for some reason but they disappear via pg_dumpall, the system may go into an unexpected state. Good point. I'm fine with simply disallowing that completely; does anyone want to argue that we should allow superusers to ALTER or GRANT to these roles? I have a hard time seeing the need for that and it could make things quite ugly. I think that it's better to allow the roles with pg_monitor to execute pgstattuple functions. They are usually used for monitoring. Thoughts? Possibly, but I'd need to look at them more closely than I have time to right now. Can you provide a use-case? That would certainly help. Also, we are mostly focused on things which are currently superuser-only capabilities; if you don't need to be superuser today then the monitoring system could be granted access using the normal mechanisms. Actually, monitoring systems won't log in directly as pg_monitor anyway; they'll log in as nagios or similar, which has been GRANT'd pg_monitor and could certainly be GRANT'd other rights also. Thanks! Stephen
Re: [HACKERS] A little RLS oversight?
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: On Sun, Jul 12, 2015 at 5:59 PM, Yaroslav wrote: I can still see all statistics for 'test' in pg_stats under an unprivileged user. Indeed, this looks like an oversight of RLS. Even if a policy is defined to prevent a user from seeing the rows of other users, it is still possible to get some information through this view. I am adding an open item regarding that for 9.5. We need to be careful to avoid the slippery slope of trying to prevent all covert channels, which has been extensively discussed previously. I tend to agree with this specific case of, if you have RLS configured on the table then we probably shouldn't allow normal users to see the stats on the table, but I don't have a problem with the usage of those stats for generating plans, which users could see the results of via EXPLAIN. I'd prefer statistics on RLS-enabled tables to be simply hidden completely from unprivileged users. This looks like something simple enough to do. @Stephen: perhaps you have some thoughts on the matter? Currently pg_stats breaks its promise to only show information about the rows the current user can read. I agree that it should be reasonably simple to do and, provided that's the case, I'm fine with doing it once I get back (currently out until the 27th). Thanks! Stephen
Re: [HACKERS] pg_upgrade + Extensions
On 07/13/2015 01:12 PM, Smitha Pamujula wrote: Yes. I have checked that the extension didn't exist in any of the databases. I used \dx to see if json_build was listed and I didn't see any. Is that sufficient to check its existence? I am about to do another testing in a few minutes on a different machine. I will capture before/after shots. Please don't top-post on the PostgreSQL lists - see http://idallen.com/topposting.html In theory it should be enough if it was installed in the standard way. But a more thorough procedure would be to run this command: select count(*) from pg_proc where prosrc ~ 'json_build'; Here's a one-liner that will check every database for you: for db in `psql -t -c 'select datname from pg_database where datallowconn'` ; do C=`psql -t -c "select count(*) from pg_proc where prosrc ~ 'json_build'" $db`; echo $db $C; done cheers andrew
Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape
On 11 July 2015 at 21:28, Tom Lane t...@sss.pgh.pa.us wrote: What are we going to do about this? I will address the points you raise, one by one. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] pg_upgrade + Extensions
Yes. I have checked that the extension didn't exist in any of the databases. I used \dx to see if json_build was listed and I didn't see any. Is that sufficient to check its existence? I am about to do another testing in a few minutes on a different machine. I will capture before/after shots. Thanks On Fri, Jul 10, 2015 at 4:26 PM, Andrew Dunstan and...@dunslane.net wrote: On Fri, Jul 10, 2015 at 5:05 PM, David E. Wheeler da...@justatheory.com wrote: On Jul 10, 2015, at 11:32 AM, Smitha Pamujula smitha.pamuj...@iovation.com wrote: Your installation references loadable libraries that are missing from the new installation. You can add these libraries to the new installation, or remove the functions using them from the old installation. A list of problem libraries is in the file: loadable_libraries.txt Failure, exiting [postgres@pdxdvrptsrd04 ~]$ cat loadable_libraries.txt Could not load library json_build ERROR: could not access file json_build: No such file or directory So you drop the json_build extension before upgrading, but pg_upgrade still complains that it's missing? That seems odd. Are you sure the extension was uninstalled from every database in the cluster? This seems likely to occur when you forgot to uninstall it from some database (e.g. template1) cheers andrew -- Smitha Pamujula Database Administrator // The Watch Woman Direct: 503.943.6764 Mobile: 503.290.6214 // Twitter: iovation www.iovation.com
Re: [HACKERS] [PATCH] SQL function to report log message
On Mon, Jul 13, 2015 at 1:11 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Jul 13, 2015 at 4:54 PM, dinesh kumar dineshkuma...@gmail.com wrote: I would like to discuss the below feature here. Feature: Having an SQL function to write messages to the log destination. Justification: As of now, we don't have an SQL function to write custom/application messages to the log destination. We have the RAISE clause, which is controlled by log_ parameters. If we had an SQL function which works irrespective of log settings, that would be good for many log parsers. What I mean is, from a DBA point of view, if we route all our native OS stats to log files in a proper format, then we can have our log reporting tools produce the most effective reports. Also, applications can log their own messages to postgres log files, which can be monitored by DBAs too. What's the actual use case for this feature other than it would be good to have it? That's a good question, Michael. When I was working as a DBA for a different RDBMS, developers used to write some serious APP errors, followed by instructions, into some sort of log and trace files. DBAs didn't have permission to check the app logs, which were owned by the Ops team. In my old case, the application was having serious OOM issues and was crashing frequently after a deployment. It wasn't consistent behavior from the app side, hence they used to send a copy of all APP metrics to trace files, and we were monitoring what was happening in the DB during the spikes on the app servers. I didn't mean that we need to have this feature just because we have it in other RDBMSs. I don't see a reason why we shouldn't have this in PG too. I see a similar item in our development list http://www.postgresql.org/message-id/53a8e96e.9060...@2ndquadrant.com. Let me know if I missed anything here.
Best Regards, Dinesh manojadinesh.blogspot.com A log message is there to give information about the state of something that happens in the backend, but in the case of this function the event that happens is the content of the function itself. It also adds a new log level for something that has a unique usage, which looks like overkill to me. Btw, you could do something more advanced with simply an extension if you really want to play with this area... But I am dubious about what kind of applications would use that. -- Michael
Re: [HACKERS] dead assignment src/bin/scripts/print.c line 421
On 07/13/2015 04:56 PM, Laurent Laborde wrote: Friendly greetings! In file src/bin/scripts/print.c, line 421: need_recordsep = false; is then set to true at line 424. Now I'm pretty sure it's a meaningless bug without any consequence (the commit that introduced it is 15 years old). There are a lot of (apparently) dead assignments here and there, but some assignments could be used for debugging purposes, so... why not. But this one? The code in question looks like this: for (f = footers; f; f = f->next) { if (need_recordsep) { print_separator(cont->opt->recordSep, fout); need_recordsep = false; } fputs(f->data, fout); need_recordsep = true; } Hmm. It does kind of make sense. Right after printing the separator, you don't need to print a separator because you just printed one. But as soon as you print the field, you need a separator again. It would be quite understandable without that dead assignment too, and that's probably how I would've written it in the first place. But since that's how it is and has been for 15 years, I'm inclined to just leave it so. - Heikki
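The control flow Heikki describes can be isolated in a standalone sketch: the flag is cleared immediately after printing a separator and unconditionally set after printing an item, which is exactly why the assignment to false is dead. A minimal illustration (the function name is invented for the example):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Join items with a separator using the same flag pattern as print.c.
 * The need_sep = 0 inside the if mirrors the dead store under
 * discussion: it is unconditionally overwritten at the loop bottom. */
static void
join_items(char *out, size_t outsz, const char *const *items, int n,
           const char *sep)
{
    int         need_sep = 0;
    int         i;

    out[0] = '\0';
    for (i = 0; i < n; i++)
    {
        if (need_sep)
        {
            strncat(out, sep, outsz - strlen(out) - 1);
            need_sep = 0;       /* dead: overwritten below */
        }
        strncat(out, items[i], outsz - strlen(out) - 1);
        need_sep = 1;
    }
}
```

Deleting the dead store changes nothing observable, which supports both Laurent's report and Heikki's "leave it" conclusion.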
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On 7/13/2015 5:36 PM, Andrew Dunstan wrote: On 07/12/2015 05:06 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 07/04/2015 11:02 AM, Tom Lane wrote: It's not apparent to me how that works at all. BTW, the .a files being linked to above are not like Unix .a static archives - they are import library files, which I think are only used at link time, not run time. The path to the DLLs isn't being hardcoded. Ah, I see. So then what Marco is suggesting is copying a coding pattern that does work. Unless there is further argument I propose to commit something very like Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython. No objection here. OK, I tried the attached patch, but when trying to build with python 3 I get this (no such problems with python2, which builds and tests fine): make -C hstore_plpython all make[1]: Entering directory '/home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython' ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I../../src/pl/plpython -I/usr/include/python3.4m -I../../contrib/hstore -I. -I.
-I../../src/include -I/usr/include/libxml2 -c -o hstore_plpython.o hstore_plpython.c ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -shared -o hstore_plpython3.dll hstore_plpython.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition -Wl,--enable-auto-import -L/usr/lib -L/usr/local/lib -Wl,--as-needed -L../../src/backend -lpostgres -L../hstore -lhstore -L../../src/pl/plpython -lplpython3 -L/usr/lib/python3.4/config-3.4m -lpython3.4m -lpgcommon -lpgport -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -lcrypt hstore_plpython.o: In function `hstore_to_plpython': /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:35: undefined reference to `PLyUnicode_FromStringAndSize' [cut] I'd like to get that fixed before applying anything. Marco, any ideas? To build with python 3, set the environment like this: PYTHON=/usr/bin/python3 - this can be done in the config file. I am only building with python2, but on cygwin there is an additional intl library for python3 binding $ python2-config --libs -lpython2.7 -ldl $ python3-config --libs -lpython3.4m -lintl -ldl cheers andrew Regards Marco
Re: [HACKERS] Freeze avoidance of very large table.
On 2015-07-13 23:48:02 +0900, Sawada Masahiko wrote: But please imagine the case where the old cluster has a table which is very large, read-only, and already vacuum-frozen. In this case, the all-frozen bits for such a table in the new cluster will not be set unless we do vacuum freeze again. The all-frozen information for such a table is lost. So what? That's the situation today… Yes, it'll trigger an anti-wraparound vacuum at some later point, after which the map bits will be set.
Re: [HACKERS] dead assignment src/bin/scripts/print.c line 421
Should have been sent to the bugs ML sorry :-/ On Mon, Jul 13, 2015 at 3:56 PM, Laurent Laborde kerdez...@gmail.com wrote: Friendly greetings ! in file src/bin/scripts/print.c line 421 : need_recordsep = false; then set to true line 424. Now I'm pretty sure it's a meaningless bug without any consequence (the commit that introduced it is 15 years old). There are a lot of (apparently) dead assignments here and there, but some assignments could be used for debugging purposes, so ... why not. But this one ? -- Laurent ker2x Laborde DBA gandi.net \o/
Re: [HACKERS] [BUGS] BUG #13126: table constraint loses its comment
On 07/08/2015 08:12 AM, Michael Paquier wrote: On Tue, Jul 7, 2015 at 6:24 PM, Petr Jelinek p...@2ndquadrant.com wrote: On 2015-07-04 13:45, Michael Paquier wrote: On Fri, Jul 3, 2015 at 11:59 PM, Petr Jelinek wrote: Well, for indexes you don't really need to add the new AT command, as IndexStmt has char *idxcomment which it will automatically use as the comment if not NULL. While I am not a huge fan of idxcomment, it doesn't seem to be easy to remove it in the future, and it's what transformTableLikeClause uses, so it might be good to be consistent with that. Oh, right, I completely missed your point and this field in IndexStmt. Yes, it is better to be consistent in this case and to use it. It also makes the code easier to follow. Btw, regarding the new AT routines, I am getting fond of them; they make it easier to follow which command is added where in the command queues. Updated patch attached. Cool, I am happy with the patch now. Marking as ready for committer. Thanks for the review. I don't much like splitting the code across multiple helper functions; it makes the overall logic more difficult to follow, although I agree that the indentation has made the code pretty hard to read as it is. I'm planning to commit the attached two patches. The first one is just reformatting changes to ATExecAlterColumnType(), turning the switch-case statements into if-else blocks. If-else doesn't cause so much indentation, and allows defining variables local to the cases more naturally, so it alleviates the indentation problem somewhat. The second patch is the actual bug fix. There was one bug in this patch: the COMMENT statement that was constructed didn't schema-qualify the relation, so if the ALTERed table was not in search_path, the operation would fail with a relation not found error (or add the comment to the wrong object). Fixed that. I plan to commit the attached patches later today or tomorrow. But how do you feel about back-patching this?
It should be possible to backpatch, although from a quick test it seems that there have been small changes to the affected code in many versions, so it would require some work. Also, in back-branches we'd need to put the new AT_ReAddComment type at the end of the list, like we did when we added AT_ReAddConstraint in 9.3. I'm inclined to backpatch this to 9.5 only, even though this is clearly a bug fix, on the grounds that it's not a very serious bug and there's always some risk in backpatching. - Heikki From c4865eb873a9cafb7e247cc69b7030243b74f3be Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas heikki.linnakan...@iki.fi Date: Mon, 13 Jul 2015 19:22:31 +0300 Subject: [PATCH 1/2] Reformat code in ATPostAlterTypeParse. The code in ATPostAlterTypeParse was very deeply indented, mostly because there were two nested switch-case statements, which add a lot of indentation. Use if-else blocks instead, to make the code less indented and more readable. This is in preparation for the next patch, which makes some actual changes to the function. These cosmetic parts have been separated to make it easier to see the real changes in the other patch.
---
 src/backend/commands/tablecmds.c | 104 +++---
 1 file changed, 51 insertions(+), 53 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d394713..e7b23f1 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -8645,69 +8645,67 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
 		Node	   *stm = (Node *) lfirst(list_item);
 		AlteredTableInfo *tab;
 
-		switch (nodeTag(stm))
+		tab = ATGetQueueEntry(wqueue, rel);
+
+		if (IsA(stm, IndexStmt))
+		{
+			IndexStmt  *stmt = (IndexStmt *) stm;
+			AlterTableCmd *newcmd;
+
+			if (!rewrite)
+				TryReuseIndex(oldId, stmt);
+
+			newcmd = makeNode(AlterTableCmd);
+			newcmd->subtype = AT_ReAddIndex;
+			newcmd->def = (Node *) stmt;
+			tab->subcmds[AT_PASS_OLD_INDEX] =
+				lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd);
+		}
+		else if (IsA(stm, AlterTableStmt))
 		{
-			case T_IndexStmt:
+			AlterTableStmt *stmt = (AlterTableStmt *) stm;
+			ListCell   *lcmd;
+
+			foreach(lcmd, stmt->cmds)
+			{
+				AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
+
+				if (cmd->subtype == AT_AddIndex)
 				{
-					IndexStmt  *stmt = (IndexStmt *) stm;
-					AlterTableCmd *newcmd;
+					Assert(IsA(cmd->def, IndexStmt));
 
 					if (!rewrite)
-						TryReuseIndex(oldId, stmt);
+						TryReuseIndex(get_constraint_index(oldId),
+									  (IndexStmt *) cmd->def);
 
-					tab = ATGetQueueEntry(wqueue, rel);
-					newcmd = makeNode(AlterTableCmd);
-					newcmd->subtype = AT_ReAddIndex;
-					newcmd->def = (Node *) stmt;
+					cmd->subtype = AT_ReAddIndex;
 					tab->subcmds[AT_PASS_OLD_INDEX] =
-						lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd);
-					break;
+						lappend(tab->subcmds[AT_PASS_OLD_INDEX], cmd);
 				}
-			case T_AlterTableStmt:
+				else
[HACKERS] [DESIGN] Incremental checksums
pgsql-hackers, So I’ve put some time into a design for the incremental checksum feature and wanted to get some feedback from the group:

* Incremental Checksums

PostgreSQL users should have a way of upgrading their cluster to use data checksums without having to do a costly pg_dump/pg_restore; in particular, checksums should be able to be enabled/disabled at will, with the database enforcing the logic of whether the pages considered for a given database are valid. Considered approaches for this are having additional flags to pg_upgrade to set up the new cluster to use checksums where they did not before (or optionally turning these off). This approach is a nice tool to have, but we also want to be able to support this process in a manner which keeps the database online while it goes through the initial checksum process. In order to support the idea of incremental checksums, this design adds the following things:

** pg_control:

Keep data_checksum_version, but have it indicate *only* the algorithm version for checksums; i.e., it's no longer used for the data_checksum enabled/disabled state. Add data_checksum_state, an enum with multiple states: disabled, enabling, enforcing (perhaps revalidating too; something to indicate that we are reprocessing a database that purports to have been completely checksummed already). An explanation of the states as well as the behavior of the checksums for each:

- disabled = not in a checksum cycle; no read validation, no checksums written. This is the current behavior for Postgres *without* checksums.
- enabling = in a checksum cycle; no read validation, write checksums. Any page that gets written to disk will have a valid checksum. This is required when transitioning a cluster which has never had checksums, as page reads would otherwise fail verification, since the checksums are uninitialized.
- enforcing = not in a checksum cycle; read validation, write checksums. This is the current behavior of Postgres *with* checksums.
(caveat: I'm not certain the following state is needed (and the current version of this patch doesn't have it)):

- revalidating = in a checksum cycle; read validation, write checksums. The difference between this and enabling is that we care if page reads fail, since by definition they should have been validly checksummed, and we should verify this.

Add data_checksum_cycle, a counter that gets incremented with every checksum cycle change. This is used as a flag to verify when new checksum actions take place, for instance if we wanted to upgrade/change the checksum algorithm, or if we just want to support periodic checksum validation. This variable will be compared against new values in the system tables to keep track of which relations still need to be checksummed in the cluster.

** pg_database:

Add a field datlastchecksum which will be the last checksum cycle which has completed for all relations in that database.

** pg_class:

Add a field rellastchecksum which stores the last successful checksum cycle for each relation.

** The checksum bgworker:

Something needs to proactively checksum any relations which need to be validated, and this something is known as the checksum bgworker. The checksum bgworker will operate similarly to the autovacuum daemons, and in fact in this initial pass we'll hook into the autovac launcher, due to similarities in catalog-reading functionality as well as balancing out with other maintenance activity. If autovacuum does not need to do any vacuuming work, it will check if the cluster has requested a checksum cycle by checking if the state is enabling (or revalidating). If so, it will look for any database which needs a checksum update. It checks the current value of the data_checksum_cycle counter and looks for any databases with datlastchecksum < data_checksum_cycle. When all databases have datlastchecksum == data_checksum_cycle, we initiate checksumming of any global cluster heap files.
When the global cluster tables' heap files have been checksummed, then we consider the checksum cycle complete, change pg_control's data_checksum_state to enforcing, and consider things fully up-to-date. If it finds a database needing work, it iterates through that database's relations looking for rellastchecksum < data_checksum_cycle. If it finds none (i.e., every record has rellastchecksum == data_checksum_cycle) then it marks the containing database as up-to-date by updating datlastchecksum = data_checksum_cycle. For any relation that it finds in the database which is not checksummed, it starts an actual worker to handle the checksum process for this table. Since the state of the cluster is already either enforcing or revalidating, any block writes will get checksums added automatically, so the only thing the bgworker needs to do is load each block in the relation and explicitly mark it as dirty (unless that's not required for FlushBuffer() to do its thing). After every block
Re: [HACKERS] [PATCH] SQL function to report log message
On 7/13/15 3:39 PM, dinesh kumar wrote: I don't really see the point of what you're describing here. Just do something like RAISE WARNING which should normally be high enough to make it into the logs. Or use a pl language that will let you write your own logfile on the server (ie: plperlu). True. Using plperlu, can we bypass our log_* settings? I wasn't sure about that. plperlu can do anything the server can do. Including fun things like appending to any file the server can write to or executing things like `rm -rf pg_xlog`. I didn't mean that; we need to have this feature since we have it in other RDBMSes. I don't see a reason why we shouldn't have this in PG too. I see a similar item in our development list http://www.postgresql.org/message-id/53a8e96e.9060...@2ndquadrant.com. That's not at all what that item is talking about. It's talking about exposing ereport as a SQL function, without altering the rest of our logging behavior. Ah. It's my bad interpretation. Let me work on it, and I will send a new patch as a wrapper SQL function for ereport. You might want to present a plan for that; it's not as trivial as it sounds due to how ereport works. In particular, I'd want to see (at minimum) the same functionality that plpgsql's RAISE command now provides (errdetail, errhint, etc). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] [PATCH] SQL function to report log message
On Mon, Jul 13, 2015 at 1:56 PM, Jim Nasby jim.na...@bluetreble.com wrote: On 7/13/15 3:39 PM, dinesh kumar wrote: I don't really see the point of what you're describing here. Just do something like RAISE WARNING which should normally be high enough to make it into the logs. Or use a pl language that will let you write your own logfile on the server (ie: plperlu). True. Using plperlu, shall we bypass our log_* settings. If it's true, i wasn't sure about it. plperlu can do anything the server can do. Including fun things like appending to any file the server can write to or executing things like `rm -rf pg_xlog`. Thanks Much Jim. I didn't mean that, we need to have this feature, since we have it on other RDBMS. I don't see a reason, why don't we have this in our PG too. I see the similar item in our development list http://www.postgresql.org/message-id/53a8e96e.9060...@2ndquadrant.com. That's not at all what that item is talking about. It's talking about exposing ereport as a SQL function, without altering the rest of our logging behavior. Ah. It's' my bad interpretation. Let me work on it, and will send a new patch as a wrapper sql function for ereport. You might want to present a plan for that; it's not as trivial as it sounds due to how ereport works. In particular, I'd want to see (at minimum) the same functionality that plpgsql's RAISE command now provides (errdetail, errhint, etc). Sure. Let me prepare a prototype for it, and will share with you before implementing. Best Regards, Dinesh manojadinesh.blogspot.com -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] [DESIGN] Incremental checksums
On 2015-07-13 15:50:44 -0500, Jim Nasby wrote: Another possibility is some kind of a page-level indicator of what binary format is in use on a given page. For checksums maybe a single bit would suffice (indicating that you should verify the page checksum). Another use case is using this to finally ditch all the old VACUUM FULL code in HeapTupleSatisfies*(). That's a bad idea, because that bit then wouldn't be protected by the checksum.
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On Tue, Jul 7, 2015 at 6:28 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Please forgive me for resending this message to fix some too-sad misspellings. # Waiting for heavy weight locks is somewhat confusing to spell.. === Hello, At Tue, 7 Jul 2015 16:27:38 +0900, Fujii Masao masao.fu...@gmail.com wrote in CAHGQGwEJwov8YwvmbbWps3Rba6kF1yf7qL3S==Oy4D=gq9y...@mail.gmail.com Each backend reports its event when trying to take a lock. But the reported event is never reset until the next event is reported. Is this OK? This means that the wait_event column keeps showing the *last* event while a backend is in the idle state, for example. So, shouldn't we reset the reported event or report another one when releasing the lock? It seems so, but pg_stat_activity.waiting would indicate whether the event is lasting. However, .waiting reflects only the status of heavy-weight locks. It would be quite misleading. I think that pg_stat_activity.wait_event should be linked to .waiting; then .wait_event should be restricted to heavy-weight locks if the meaning of .waiting cannot be changed. Yeah, that's clearly no good. It only makes sense to have wait_event show the most recent event if waiting tells you whether the wait is still ongoing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] SQL function to report log message
Jim Nasby jim.na...@bluetreble.com writes: On 7/13/15 3:39 PM, dinesh kumar wrote: Ah. It's' my bad interpretation. Let me work on it, and will send a new patch as a wrapper sql function for ereport. You might want to present a plan for that; it's not as trivial as it sounds due to how ereport works. In particular, I'd want to see (at minimum) the same functionality that plpgsql's RAISE command now provides (errdetail, errhint, etc). The real question is why the existing functionality in plpgsql isn't sufficient. Somebody who wants a log from SQL function can easily write a simple plpgsql function that does exactly what they want, with no more or fewer bells-n-whistles than they need. If we try to create a SQL function that does all that, it's likely to be a mess to use, even with named arguments. I'm not necessarily against the basic idea, but I think inventing something that actually offers an increment in usability compared to the existing alternative is going to be harder than it sounds. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL function to report log message
On 7/13/15 12:39 PM, dinesh kumar wrote: As of now, we don't have an SQL function to write custom/application messages to the log destination. We have the RAISE clause, which is controlled by log_ parameters. If we had an SQL function which works irrespective of log settings, that would be good for many log parsers. What I mean is, from a DBA point of view, if we route all our native OS stats to log files in a proper format, then we can have our log reporting tools give the most effective reports. Also, applications can log their own messages to postgres log files, which can be monitored by DBAs too. What's the actual use case for this feature other than it would be good to have it? That's a good question Michael. When I was working as a DBA for a different RDBMS, developers used to write some serious APP errors, followed by instructions, into some sort of log and trace files. Since the DBAs didn't have permission to check app logs, which were owned by the Ops team. In my old case, the application was having serious OOM issues and was crashing frequently after the deployment. The behavior wasn't consistent on the app side, hence they used to send a copy of all APP metrics to trace files, and we were watching in the DB what was happening during the spike on the app servers. Spewing a bunch of stuff into the postgres log doesn't seem like an improvement here. I don't really see the point of what you're describing here. Just do something like RAISE WARNING which should normally be high enough to make it into the logs. Or use a pl language that will let you write your own logfile on the server (ie: plperlu). I didn't mean that; we need to have this feature since we have it in other RDBMSes. I don't see a reason why we shouldn't have this in PG too. I see a similar item in our development list http://www.postgresql.org/message-id/53a8e96e.9060...@2ndquadrant.com. That's not at all what that item is talking about.
It's talking about exposing ereport as a SQL function, without altering the rest of our logging behavior. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive
On Mon, Jul 6, 2015 at 10:48 AM, Fujii Masao masao.fu...@gmail.com wrote: According to his patch, the wait events that he was thinking to add were:

+ typedef enum PgCondition
+ {
+ 	PGCOND_UNUSED = 0,			/* unused */
+
+ 	/* 1 - CPU */
+ 	PGCOND_CPU = 1,				/* generic cpu operations */
+ 	/* 11000 - CPU:PARSE */
+ 	PGCOND_CPU_PARSE = 11000,		/* pg_parse_query */
+ 	PGCOND_CPU_PARSE_ANALYZE = 11100,	/* parse_analyze */
+ 	/* 12000 - CPU:REWRITE */
+ 	PGCOND_CPU_REWRITE = 12000,		/* pg_rewrite_query */
+ 	/* 13000 - CPU:PLAN */
+ 	PGCOND_CPU_PLAN = 13000,		/* pg_plan_query */
+ 	/* 14000 - CPU:EXECUTE */
+ 	PGCOND_CPU_EXECUTE = 14000,		/* PortalRun or PortalRunMulti */
[ etc. ]

Sorry to say it, but I think this design is a mess. Suppose we are executing a query, and during the query we execute the ILIKE operator, and within that we try to acquire a buffer content lock (say, to detoast some data). So at the outermost level our state is PGCOND_CPU_EXECUTE, and then within that we are in state PGCOND_CPU_ILIKE, and then within that we are in state PGCOND_LWLOCK_PAGE. When we exit each of the inner states, we've got to restore the proper outer state, or time will be mis-attributed. Error handling has got to pop all of the items off the stack that were added since the PG_TRY() block started, and then push on a new state for error handling, which gets popped when the PG_TRY block finishes. Another problem is that some of these things are incredibly specific (like running the ILIKE operator) and others are extremely general (like executing the query). Why does ILIKE get a code but +(int4,int4) does not? We need some less-arbitrary way of assigning codes than what's shown here. Now, that's not to say there are no good ideas here. For example, pg_stat_activity could expose a byte of state indicating which phase of query processing is current: parse / parse analysis / rewrite / plan / execute / none.
I think that'd be a fine thing to do, and I support doing it, although maybe not in the same patch as my original proposal. On the flip side, I don't support trying to expose information on the level of "which C function are we currently executing?" because I think there's going to be absolutely no reasonable way to make that sufficiently low-overhead, and also because I don't see any way to make it less than nightmarishly onerous from the point of view of code maintenance. We could expose some functions but not others, but that seems like a mess; I think unless and until we have a better solution, the right answer to "I need to know which C function is running in each backend" is "that's what perf is for". In any case, I think the main point is that Itagaki-san's proposal is not really a proposal for wait events. It is a proposal to expose some state about "what is the backend doing right now?", which might be waiting or something else. I believe those things should be separated into several pieces of state. It's entirely reasonable to want to know whether we are in the parse phase or plan phase separately from knowing whether we are waiting on an lwlock or not, because then you could (for example) judge what percentage of your lock waits are coming from parsing vs. what fraction are coming from planning, which somebody might care about. Or you might care about ONLY the phase of query processing and not about wait events at all, and then you can ignore one column and look at the other. With the above proposal, those things all get munged together in a way that I think is bound to be awkward. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] SQL function to report log message
On Mon, Jul 13, 2015 at 1:29 PM, Jim Nasby jim.na...@bluetreble.com wrote: On 7/13/15 12:39 PM, dinesh kumar wrote: As of now, we don't have an SQL function to write custom/application messages to log destination. We have RAISE clause which is controlled by log_ parameters. If we have an SQL function which works irrespective of log settings, that would be a good for many log parsers. What i mean is, in DBA point of view, if we route all our native OS stats to log files in a proper format, then we can have our log reporting tools to give most effective reports. Also, Applications can log their own messages to postgres log files, which can be monitored by DBAs too. What's the actual use case for this feature other than it would be good to have it? That's a good question Michael. When i was working as a DBA for a different RDBMS, developers used to write some serious APP errors, Followed by instructions into some sort of log and trace files. Since, DBAs didn't have the permission to check app logs, which was owned by Ops team. In my old case, application was having serious OOM issues, which was crashing frequently after the deployment. It wasn't the consistent behavior from the app side, hence they used to sent a copy all APP metrics to trace files, and we were monitoring the DB what was happening during the spike on app servers. Spewing a bunch of stuff into the postgres log doesn't seem like an improvement here. Agreed Jim. I don't really see the point of what you're describing here. Just do something like RAISE WARNING which should normally be high enough to make it into the logs. Or use a pl language that will let you write your own logfile on the server (ie: plperlu). True. Using plperlu, shall we bypass our log_* settings. If it's true, i wasn't sure about it. I didn't mean that, we need to have this feature, since we have it on other RDBMS. I don't see a reason, why don't we have this in our PG too. 
I see a similar item in our development list http://www.postgresql.org/message-id/53a8e96e.9060...@2ndquadrant.com. That's not at all what that item is talking about. It's talking about exposing ereport as a SQL function, without altering the rest of our logging behavior. Ah. It's my bad interpretation. Let me work on it, and I will send a new patch as a wrapper SQL function for ereport. Thanks again. Regards, Dinesh manojadinesh.blogspot.com -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] [DESIGN] Incremental checksums
On Jul 13, 2015, at 3:50 PM, Jim Nasby jim.na...@bluetreble.com wrote: On 7/13/15 3:26 PM, David Christensen wrote: * Incremental Checksums PostgreSQL users should have a way of upgrading their cluster to use data checksums without having to do a costly pg_dump/pg_restore; in particular, checksums should be able to be enabled/disabled at will, with the database enforcing the logic of whether the pages considered for a given database are valid. Considered approaches for this are having additional flags to pg_upgrade to set up the new cluster to use checksums where they did not before (or optionally turning these off). This approach is a nice tool to have, but we also want to be able to support this process in a manner which keeps the database online while it goes through the initial checksum process. It would be really nice if this could be extended to handle different page formats as well, something that keeps rearing its head. Perhaps that could be done with the cycle idea you've described. I had had this thought too, but the main issues I saw were that new page formats were not guaranteed to take up the same space/storage, so there was an inherent limitation on the ability to restructure things *arbitrarily*; that being said, there may be a use-case for the types of modifications that this approach *would* be able to handle. Another possibility is some kind of a page-level indicator of what binary format is in use on a given page. For checksums maybe a single bit would suffice (indicating that you should verify the page checksum). Another use case is using this to finally ditch all the old VACUUM FULL code in HeapTupleSatisfies*(). There’s already a page version field, no? I assume that would be sufficient for the page format indicator. I don’t know about using a flag for verifying the checksum, as that already modifies the page which is to be checksummed, and we want to avoid having to rewrite a bunch of pages unnecessarily, no?
And you’d presumably need to clear that state again, which would be an additional write. This was the issue that the checksum cycle was meant to handle, since we store this information in the system catalogs and the types of modifications here would be idempotent. David -- David Christensen PostgreSQL Team Manager End Point Corporation da...@endpoint.com 785-727-1171
Re: [HACKERS] [DESIGN] Incremental checksums
On 7/13/15 3:26 PM, David Christensen wrote: * Incremental Checksums PostgreSQL users should have a way of upgrading their cluster to use data checksums without having to do a costly pg_dump/pg_restore; in particular, checksums should be able to be enabled/disabled at will, with the database enforcing the logic of whether the pages considered for a given database are valid. Considered approaches for this are having additional flags to pg_upgrade to set up the new cluster to use checksums where they did not before (or optionally turning these off). This approach is a nice tool to have, but we also want to be able to support this process in a manner which keeps the database online while it goes through the initial checksum process. It would be really nice if this could be extended to handle different page formats as well, something that keeps rearing its head. Perhaps that could be done with the cycle idea you've described. Another possibility is some kind of a page-level indicator of what binary format is in use on a given page. For checksums maybe a single bit would suffice (indicating that you should verify the page checksum). Another use case is using this to finally ditch all the old VACUUM FULL code in HeapTupleSatisfies*(). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On 07/13/2015 11:53 AM, Marco Atzeri wrote: On 7/13/2015 5:36 PM, Andrew Dunstan wrote: On 07/12/2015 05:06 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 07/04/2015 11:02 AM, Tom Lane wrote: It's not apparent to me how that works at all. BTW, the .a files being linked to above are not like Unix .a static archives - they are import library files, which I think they are only used at link time, not run time. The path to the DLLs isn't being hardcoded. Ah, I see. So then what Marco is suggesting is copying a coding pattern that does work. Unless there is further argument I propose to commit something very like Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython No objection here. OK, I tried the attached patch. but when trying to build with python 3 I get this (no such problems with python2, which builds and tests fine): make -C hstore_plpython all make[1]: Entering directory '/home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython' ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I../../src/pl/plpython -I/usr/include/python3.4m -I../../contrib/hstore -I. -I. 
-I../../src/include -I/usr/include/libxml2 -c -o hstore_plpython.o hstore_plpython.c ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -shared -o hstore_plpython3.dll hstore_plpython.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition -Wl,--enable-auto-import -L/usr/lib -L/usr/local/lib -Wl,--as-needed -L../../src/backend -lpostgres -L../hstore -lhstore -L../../src/pl/plpython -lplpython3 -L/usr/lib/python3.4/config-3.4m -lpython3.4m -lpgcommon -lpgport -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -lcrypt hstore_plpython.o: In function `hstore_to_plpython': /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:35: undefined reference to `PLyUnicode_FromStringAndSize' [cut] I'd like to get that fixed before applying anything. Marco, any ideas? To build with python 3, set the environment like this: PYTHON=/usr/bin/python3 - this can be done in the config file. I am only building with python2, but on cygwin there is an additional intl library for python3 binding $ python2-config --libs -lpython2.7 -ldl $ python3-config --libs -lpython3.4m -lintl -ldl No this doesn't seem to be the problem. For some reason it's apparently not finding the symbol in plpython3.dll, where it should definitely exist, unless I'm completely off base. So I'm rather confused. cheers andrew
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
Andrew Dunstan and...@dunslane.net writes: On 07/13/2015 11:53 AM, Marco Atzeri wrote: On 7/13/2015 5:36 PM, Andrew Dunstan wrote: hstore_plpython.o: In function `hstore_to_plpython': /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:35: undefined reference to `PLyUnicode_FromStringAndSize' No this doesn't seem to be the problem. For some reason it's apparently not finding the symbol in plpython3.dll, where it should definitely exist, unless I'm completely off base. So I'm rather confused. Could hstore_plpython and plpython somehow have been built with different ideas about PY_MAJOR_VERSION? PLyUnicode_FromStringAndSize is conditionally compiled, and the reference to it from hstore_plpython depends on a conditionally-defined macro, and this error would make plenty of sense if those conditions somehow diverged. So I'd look for instance at whether identical -I paths were used in both parts of the build. regards, tom lane
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
Andrew Dunstan and...@dunslane.net writes: Not AFAICT. Here is the contrib build: Hm ... what does -DUSE_DL_IMPORT do? regards, tom lane
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On 07/13/2015 07:33 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Not AFAICT. Here is the contrib build: Hm ... what does -DUSE_DL_IMPORT do? NFI - I tried building with that but it made no difference. cheers andrew
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On Tue, Jul 14, 2015 at 8:49 AM, Andrew Dunstan and...@dunslane.net wrote: On 07/13/2015 07:33 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Not AFAICT. Here is the contrib build: Hm ... what does -DUSE_DL_IMPORT do? NFI - I tried building with that but it made no difference. Is your python3 build a 32b version? -- Michael
Re: [HACKERS] Support for N synchronous standby servers - take 2
On Mon, Jul 13, 2015 at 10:34 PM, Fujii Masao wrote: On Fri, Jul 10, 2015 at 10:06 PM, Beena Emerson wrote: On Tue, Jul 7, 2015 at 2:19 PM, Michael Paquier wrote: Something like pg_syncinfo/ coupled with a LW lock, we already do something similar for replication slots with pg_replslot/. I was trying to figure out how the JSON metadata can be used. It would have to be set using a given set of functions. So we can use only such a set of functions to configure synch rep? I don't like that idea. Because it prevents us from configuring that while the server is not running. If you store a json blob in a set of files of PGDATA you could update them manually there as well. That's perhaps re-inventing the wheel with what is available with GUCs though. Is JSON being preferred because it would be ALTER SYSTEM friendly and in a format already known to users? At least currently ALTER SYSTEM cannot accept JSON data (e.g., the return value of a JSON function like json_build_object()) as the setting value. So I'm not sure how friendly ALTER SYSTEM and the JSON format really are. If you want to argue that, you probably need to improve ALTER SYSTEM so that JSON can be specified. In a real-life scenario, at most how many groups and nesting would be expected? I don't think that many groups and nestings are common. Yeah, in most common configurations people are not going to have more than 3 groups with only one level of nodes. -- Michael
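To make the discussion concrete, here is a minimal sketch of how a grouped, nested synchronous-standby configuration could be expressed in JSON and evaluated. The schema (a group is an object with `quorum` and `nodes`, where a member is either a standby name or a nested group) is purely hypothetical, invented for this illustration, not anything the patch defines:

```python
import json

# Hypothetical JSON layout: a group is satisfied when at least `quorum`
# of its members are; a member is either a standby name or a nested group.
config = json.loads("""
{"quorum": 2,
 "nodes": ["london1",
           {"quorum": 1, "nodes": ["paris1", "paris2"]},
           "tokyo1"]}
""")

def satisfied(group, acked):
    """Return True once enough standbys in `acked` cover the group's quorum."""
    def member_ok(m):
        if isinstance(m, str):
            return m in acked          # a plain standby name
        return satisfied(m, acked)     # a nested group, evaluated recursively
    return sum(member_ok(m) for m in group["nodes"]) >= group["quorum"]

# Commit is safe once two of: london1, (paris1 or paris2), tokyo1 have acked.
assert satisfied(config, {"london1", "paris2"})
assert not satisfied(config, {"paris1"})
```

With only one level of nesting and up to three groups, as suggested above, the same structure could plausibly be flattened into a GUC string instead, which is the trade-off being debated.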
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On 07/12/2015 05:06 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 07/04/2015 11:02 AM, Tom Lane wrote: It's not apparent to me how that works at all. BTW, the .a files being linked to above are not like Unix .a static archives - they are import library files, which I think are only used at link time, not run time. The path to the DLLs isn't being hardcoded. Ah, I see. So then what Marco is suggesting is copying a coding pattern that does work. Unless there is further argument, I propose to commit something very like Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython. No objection here. OK, I tried the attached patch, but when trying to build with python 3 I get this (no such problems with python2, which builds and tests fine): make -C hstore_plpython all make[1]: Entering directory '/home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython' ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I../../src/pl/plpython -I/usr/include/python3.4m -I../../contrib/hstore -I. -I. 
-I../../src/include -I/usr/include/libxml2 -c -o hstore_plpython.o hstore_plpython.c ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -shared -o hstore_plpython3.dll hstore_plpython.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition -Wl,--enable-auto-import -L/usr/lib -L/usr/local/lib -Wl,--as-needed -L../../src/backend -lpostgres -L../hstore -lhstore -L../../src/pl/plpython -lplpython3 -L/usr/lib/python3.4/config-3.4m -lpython3.4m -lpgcommon -lpgport -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -lcrypt hstore_plpython.o: In function `hstore_to_plpython': /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:35: undefined reference to `PLyUnicode_FromStringAndSize' /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:35:(.text+0x99): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `PLyUnicode_FromStringAndSize' /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:28: undefined reference to `PLyUnicode_FromStringAndSize' /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:28:(.text+0xf1): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `PLyUnicode_FromStringAndSize' hstore_plpython.o: In function `plpython_to_hstore': /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:96: undefined reference to `PLyObject_AsString' /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:96:(.text+0x2cc): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `PLyObject_AsString' /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:84: undefined reference to `PLyObject_AsString' /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:84:(.text+0x321): relocation truncated to fit: 
R_X86_64_PC32 against undefined symbol `PLyObject_AsString' collect2: error: ld returned 1 exit status ../../src/Makefile.shlib:358: recipe for target 'hstore_plpython3.dll' failed make[1]: *** [hstore_plpython3.dll] Error 1 make[1]: Leaving directory '/home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython' Makefile:92: recipe for target 'all-hstore_plpython-recurse' failed make: *** [all-hstore_plpython-recurse] Error 2 I'd like to get that fixed before applying anything. Marco, any ideas? To build with python 3, set the environment like this: PYTHON=/usr/bin/python3 - this can be done in the config file. cheers andrew diff --git a/contrib/hstore_plperl/Makefile b/contrib/hstore_plperl/Makefile index 19a8ab4..603ef52 100644 --- a/contrib/hstore_plperl/Makefile +++ b/contrib/hstore_plperl/Makefile @@ -30,6 +30,10 @@ override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment SHLIB_LINK += ../hstore/libhstore.a $(wildcard ../../src/pl/plperl/libperl*.a) endif +ifeq ($(PORTNAME), cygwin) +SHLIB_LINK += -L../hstore -l hstore $(perl_embed_ldflags) +endif + # As with plperl we need to make sure that the CORE directory is included # last, probably because it sometimes contains some header files with names # that clash with some of ours, or with some that we include, notably on diff --git a/contrib/hstore_plpython/Makefile b/contrib/hstore_plpython/Makefile index 6ee434b..dfff0fd 100644 --- a/contrib/hstore_plpython/Makefile +++ b/contrib/hstore_plpython/Makefile @@ -28,6 +28,12 @@ ifeq ($(PORTNAME), win32) SHLIB_LINK += ../hstore/libhstore.a $(wildcard
Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape
I wrote: TBH, I think the right thing to do at this point is to revert the entire patch and send it back for ground-up rework. I think the high-level design is wrong in many ways and I have about zero confidence in most of the code details as well. I'll send a separate message about high-level issues, And here's that. I do *not* claim that this is a complete list of design issues with the patch, it's just things I happened to notice in the amount of time I've spent so far (which is already way more than I wanted to spend on TABLESAMPLE right now). I'm not sure that we need an extensible sampling-method capability at all, let alone that an API for that needs to be the primary focus of a patch. Certainly the offered examples of extension modules aren't inspiring: tsm_system_time is broken by design (more about that below) and nobody would miss tsm_system_rows if it weren't there. What is far more important is to get sampling capability into indexscans, and designing an API ahead of time seems like mostly a distraction from that. I'd think seriously about tossing the entire executor-stage part of the patch, creating a stateless (history-independent) sampling filter that just accepts or rejects TIDs, and sticking calls to that into all the table scan node types. Once you had something like that working well it might be worth considering whether to try to expose an API to generalize it. But even then it's not clear that we really need any more options than true-Bernoulli and block-level sampling. The IBM paper I linked to in the other thread mentions that their optimizer will sometimes choose to do Bernoulli sampling even if SYSTEM was requested. Probably that happens when it decides to do a simple indexscan, because then there's no advantage to trying to cluster the sampled rows. 
But in the case of a bitmap scan, you could very easily do either true Bernoulli or block-level sampling simply by adjusting the rules about which bits you keep or clear in the bitmap (given that you apply the filter between collection of the index bitmap and accessing the heap, which seems natural). The only case where a special scan type really seems to be needed is if you want block-level sampling, the query would otherwise use a seqscan, *and* the sampling percentage is pretty low --- if you'd be reading every second or third block anyway, you're likely better off with a plain seqscan so that the kernel sees sequential access and does prefetching. The current API doesn't seem to make it possible to switch between seqscan and read-only-selected-blocks based on the sampling percentage, but I think that could be an interesting optimization. (Another bet that's been missed is having the samplescan logic request prefetching when it is doing selective block reads. The current API can't support that AFAICS, since there's no expectation that nextblock calls could be done asynchronously from nexttuple calls.) Another issue with the API as designed is the examinetuple() callback. Allowing sample methods to see invisible tuples is quite possibly the worst idea in the whole patch. They can *not* do anything with such tuples, or they'll totally break reproducibility: if the tuple is invisible to your scan, it might well be or soon become invisible to everybody, whereupon it would be subject to garbage collection at the drop of a hat. So if an invisible tuple affects the sample method's behavior at all, repeated scans in the same query would not produce identical results, which as mentioned before is required both by spec and for minimally sane query behavior. Moreover, examining the contents of the tuple is unsafe (it might contain pointers to TOAST values that no longer exist); and even if it were safe, what's the point? 
Sampling that pays attention to the data is the very definition of biased. So if we do re-introduce an API like the current one, I'd definitely lose this bit and only allow sample methods to consider visible tuples. On the point of reproducibility: the tsm_system_time method is utterly incapable of producing identical results across repeated scans, because there is no reason to believe it would get exactly as far in the same amount of time each time. This might be all right across queries if the method could refuse to work with REPEATABLE clauses (but there's no provision for such a restriction in the current API). But it's not acceptable within a query. Another problem with tsm_system_time is that its cost/rowcount estimation is based on nothing more than wishful thinking, and can never be based on anything more than wishful thinking, because planner cost units are not time-based. Adding a comment that says we'll nonetheless pretend they are milliseconds isn't a solution. So that sampling method really has to go away and never come back, whatever we might ultimately salvage from this patch otherwise. (I'm not exactly convinced that the system or tsm_system_rows methods are adequately
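The "stateless (history-independent) sampling filter that just accepts or rejects TIDs" that Tom sketches above can be modeled as a pure function of (seed, TID). This is an illustrative Python sketch, not proposed code; the hash choice and cutoff scheme are my own assumptions. Because acceptance depends only on the seed and the TID, repeated scans within a query see the identical sample, which is exactly the reproducibility property a time-based method cannot provide:

```python
import hashlib

def tid_sample_filter(seed: int, fraction: float):
    """Stateless Bernoulli filter over TIDs: accept/reject depends only on
    (seed, block, offset), so every scan with the same seed sees the same
    sample, regardless of scan order or timing."""
    cutoff = int(fraction * 2**32)
    def accept(block: int, offset: int) -> bool:
        h = hashlib.blake2b(f"{seed}:{block}:{offset}".encode(),
                            digest_size=4).digest()
        return int.from_bytes(h, "big") < cutoff
    return accept

accept = tid_sample_filter(seed=42, fraction=0.1)
sample = [(b, o) for b in range(100) for o in range(50) if accept(b, o)]
# History-independence: a second pass yields exactly the same TIDs.
assert sample == [(b, o) for b in range(100) for o in range(50)
                  if accept(b, o)]
# The accepted share is close to the requested 10% of the 5000 TIDs.
assert 300 < len(sample) < 700
```

A filter with this shape could be consulted from any scan node type (seqscan, indexscan, bitmap heap scan) since it needs no per-scan state, and block-level sampling falls out of the same idea by hashing only the block number.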
Re: [HACKERS] intarray planning/exeuction problem with multiple operators
Jeff Janes jeff.ja...@gmail.com writes: I've found an interesting performance problem in the intarray extension module. It doesn't seem to be version dependent, I've verified it in 9.4.4 and 9.6devel. Seems like this isn't specifically an issue for intarray, but rather one with the core GIN code not being smart about the combination of selective and unselective index conditions. In particular, it seems like the smart thing for GIN to do with this example is just ignore the @@ condition altogether so far as the index search goes, and mark all the results as needing recheck so that the @@ operator gets applied as a filter. You could also complain about the core planner not considering leaving some potentially-indexable quals out of the actual index condition, but TBH I don't see that that would be a useful use of planner cycles. In almost every case it would mean that if there are K quals potentially usable with a given index, we'd cost out 2^K-1 index paths and immediately reject all but the use-em-all alternative. That's a lot of cycles to spend to handle a corner case. It's always been assumed to be the index AM's problem to optimize use of the index quals it's handed, and I don't see why that shouldn't be true here. The proposed selectivity estimate improvement patch for @@ does not fix this (and evaluating that patch was how I found this issue.) Nah, it wouldn't: the cost estimates are correct, in the sense that gincostestimate realizes that GIN will be horribly stupid about this. regards, tom lane
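The strategy Tom describes, keeping the selective `=` qual in the index search and demoting the unselective `@@` qual to a recheck-style filter, can be shown with a toy model. Everything here (the table, `index_probe`, `scan`) is invented for the illustration; real GIN operates on posting lists, not Python lists:

```python
# Toy model: answer the query using only the cheap, selective equality
# qual against the "index", then apply the expensive, unselective qual as
# a recheck filter on the handful of candidate rows -- instead of walking
# huge posting lists for the unselective qual inside the index search.

rows = [{"id": i, "foo": [i % 7, i % 11]} for i in range(10_000)]

def index_probe(value):
    # Stands in for an exact-match GIN index probe on the selective qual.
    return [r for r in rows if r["foo"] == value]

def scan(eq_value, recheck):
    candidates = index_probe(eq_value)            # selective: few rows back
    return [r for r in candidates if recheck(r)]  # recheck: cheap at this size

# '=' goes through the index; '@@ !1' (no element equal to 1) is rechecked.
hits = scan([3, 10], lambda r: 1 not in r["foo"])
assert all(r["foo"] == [3, 10] and 1 not in r["foo"] for r in hits)
```

The model also shows why the planner need not enumerate 2^K-1 index paths: the index AM alone decides which of the quals it was handed are worth searching on.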
Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape
On 13 July 2015 at 17:00, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: TBH, I think the right thing to do at this point is to revert the entire patch and send it back for ground-up rework. I think the high-level design is wrong in many ways and I have about zero confidence in most of the code details as well. I'll send a separate message about high-level issues, And here's that. I do *not* claim that this is a complete list of design issues with the patch, it's just things I happened to notice in the amount of time I've spent so far (which is already way more than I wanted to spend on TABLESAMPLE right now). Interesting that a time-based sample does indeed yield useful results. Good to know. That is exactly what this patch provides: a time-based sample, with reduced confidence in the accuracy of the result. And other things too. I'm not sure that we need an extensible sampling-method capability at all, let alone that an API for that needs to be the primary focus of a patch. Certainly the offered examples of extension modules aren't inspiring: tsm_system_time is broken by design (more about that below) and nobody would miss tsm_system_rows if it weren't there. Time based sampling isn't broken by design. It works exactly according to the design. Time-based sampling has been specifically requested by data visualization developers who are trying to work out how to display anything useful from a database beyond a certain size. The feature for PostgreSQL implements a similar mechanism to that deployed already with BlinkDB, demonstrated at VLDB 2012. I regard it as an Advanced feature worthy of deployment within PostgreSQL, based upon user request. Row based sampling offers the capability to retrieve a sample of a fixed size however big the data set. I shouldn't need to point out that this is already in use within the ANALYZE command. Since it implements a technique already in use within PostgreSQL, I see no reason not to expose that to users. 
As I'm sure you're aware, there is rigorous math backing up the use of fixed size sampling, with recent developments from my research colleagues at Manchester University that further emphasise the usefulness of this feature for us. What is far more important is to get sampling capability into indexscans, and designing an API ahead of time seems like mostly a distraction from that. This has come up as a blocker on TABLESAMPLE before. I've got no evidence to show anyone cares about that. I can't imagine a use for it myself; it was not omitted from this patch because it's difficult, it was omitted because it's just useless. If anyone ever cares, they can add it in a later release. I'd think seriously about tossing the entire executor-stage part of the patch, creating a stateless (history-independent) sampling filter that just accepts or rejects TIDs, and sticking calls to that into all the table scan node types. Once you had something like that working well it might be worth considering whether to try to expose an API to generalize it. But even then it's not clear that we really need any more options than true-Bernoulli and block-level sampling. See above. The IBM paper I linked to in the other thread mentions that their optimizer will sometimes choose to do Bernoulli sampling even if SYSTEM was requested. That sounds broken to me. This patch gives the user what the user asks for. BERNOULLI or SYSTEM. If you particularly like the idea of mixing the two sampling methods, you can do so *because* the sampling method API is extensible for the user. So Mr. DB2 can get it his way if he likes and he can call it SYSNOULLI if he likes; others can get it the way they like also. No argument required. It was very, very obvious that whatever sampling code anybody dreamt up would be objected to. Clearly, we need a way to allow people to implement whatever technique they wish. Stratified sampling, modified sampling, new techniques. Whatever. 
Probably that happens when it decides to do a simple indexscan, because then there's no advantage to trying to cluster the sampled rows. But in the case of a bitmap scan, you could very easily do either true Bernoulli or block-level sampling simply by adjusting the rules about which bits you keep or clear in the bitmap (given that you apply the filter between collection of the index bitmap and accessing the heap, which seems natural). The only case where a special scan type really seems to be needed is if you want block-level sampling, the query would otherwise use a seqscan, *and* the sampling percentage is pretty low --- if you'd be reading every second or third block anyway, you're likely better off with a plain seqscan so that the kernel sees sequential access and does prefetching. The current API doesn't seem to make it possible to switch between seqscan and read-only-selected-blocks based on the sampling percentage, but I think that could be an interesting optimization.
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On 07/13/2015 05:18 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 07/13/2015 11:53 AM, Marco Atzeri wrote: On 7/13/2015 5:36 PM, Andrew Dunstan wrote: hstore_plpython.o: In function `hstore_to_plpython': /home/andrew/bf64/root/HEAD/pgsql/contrib/hstore_plpython/hstore_plpython.c:35: undefined reference to `PLyUnicode_FromStringAndSize' No this doesn't seem to be the problem. For some reason it's apparently not finding the symbol in plpython3.dll, where it should definitely exist, unless I'm completely off base. So I'm rather confused. Could hstore_plpython and plpython somehow have been built with different ideas about PY_MAJOR_VERSION? PLyUnicode_FromStringAndSize is conditionally compiled, and the reference to it from hstore_plpython depends on a conditionally-defined macro, and this error would make plenty of sense if those conditions somehow diverged. So I'd look for instance at whether identical -I paths were used in both parts of the build. Not AFAICT. Here is the contrib build: ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I../../src/pl/plpython -I/usr/include/python3.4m -I../../contrib/hstore -I. -I. 
-I../../src/include -I/usr/include/libxml2 -c -o hstore_plpython.o hstore_plpython.c ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -shared -o hstore_plpython3.dll hstore_plpython.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition -Wl,--enable-auto-import -L/usr/lib -L/usr/local/lib -Wl,--as-needed -L../../src/backend -lpostgres -L../hstore -lhstore -L../../src/pl/plpython -lplpython3 -L/usr/lib/python3.4/config-3.4m -lpython3.4m -lpgcommon -lpgport -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -lcrypt and here is the plpython build: ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -I. -I. -I/usr/include/python3.4m -I../../../src/include -I/usr/include/libxml2 -DUSE_DL_IMPORT -c -o plpy_util.o plpy_util.c ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -O2 -shared -o plpython3.dll plpy_cursorobject.o plpy_elog.o plpy_exec.o plpy_main.o plpy_planobject.o plpy_plpymodule.o plpy_procedure.o plpy_resultobject.o plpy_spi.o plpy_subxactobject.o plpy_typeio.o plpy_util.o -L../../../src/port -L../../../src/common -Wl,--allow-multiple-definition -Wl,--enable-auto-import -L/usr/lib -L/usr/local/lib -Wl,--as-needed -L/usr/lib/python3.4/config-3.4m -lpython3.4m -lintl -ldl -L../../../src/backend -lpostgres -lpgcommon -lpgport -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -lcrypt The functions are in fact apparently built - the names are present in the object file and the DLL. 
cheers andrew
[HACKERS] Forensic recovery deleted pgdump custom format file
Hello. I need some help. I have the following situation. My client deleted a number of old backups, made by pg_dump with the custom flag activated, from a disk drive. I could not find any program to recover backup files made by pg_dump in the custom/binary format. So I decided to try to understand the file format generated by pg_dump. Analyzing the pg_dump/pg_restore source code, I concluded a few things: the header of the file always starts with PGDMP, followed by the pg_dump version number used, followed by the int size, offset size, etc., followed by the TOCs. My question is: how do I find the end of the file? Is there any signature that I can use? Or would I have to analyze the whole file? Thank you. -- David
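A carving sketch for the question above, in Python. The layout assumed here is only what the message itself describes (the "PGDMP" magic, then version bytes, then int-size and offset-size bytes); as far as I know the custom format has no end-of-file signature, so a recovery tool can only locate candidate starts and then parse TOC entries forward until they stop being plausible:

```python
MAGIC = b"PGDMP"

def find_dump_candidates(image: bytes):
    """Scan a raw disk image for plausible pg_dump custom-format headers.
    Assumed layout after the magic (per the description above): one byte
    each for major/minor/revision version, then the int-size and
    offset-size bytes.  Sanity checks weed out random matches; a carver
    must parse forward from each candidate start to find where it ends."""
    hits = []
    pos = image.find(MAGIC)
    while pos != -1:
        hdr = image[pos + 5: pos + 10]
        if len(hdr) == 5:
            vmaj, vmin, vrev, intsize, offsize = hdr
            if vmaj < 20 and intsize in (4, 8) and offsize in (4, 8):
                hits.append((pos, (vmaj, vmin, vrev)))
        pos = image.find(MAGIC, pos + 1)
    return hits

# A 100-byte gap of zeros, then a header claiming version 1.12.0,
# 4-byte ints, 8-byte offsets.
image = b"\x00" * 100 + MAGIC + bytes([1, 12, 0, 4, 8]) + b"\x00" * 50
assert find_dump_candidates(image) == [(100, (1, 12, 0))]
```

On a real deleted-file recovery the image would come from the raw device (e.g. reading `/dev/sdX` in chunks), and each candidate would be fed to pg_restore -l to see whether a usable TOC survives.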
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On 07/13/2015 07:55 PM, Michael Paquier wrote: On Tue, Jul 14, 2015 at 8:49 AM, Andrew Dunstan and...@dunslane.net wrote: On 07/13/2015 07:33 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Not AFAICT. Here is the contrib build: Hm ... what does -DUSE_DL_IMPORT do? NFI - I tried building with that but it made no difference. Is your python3 build a 32b version? No, this is all 64 bit. cheers andrew
Re: [HACKERS] Forensic recovery deleted pgdump custom format file
On Tue, Jul 14, 2015 at 9:28 AM, David Guimaraes skys...@gmail.com wrote: So I decided to try to understand the file format generated by pgdump. Analyzing the source code of pgdump/recovery, i concluded a few things: The header of the file always starts with PGDMP followed by pgdump version number used, followed by int size, offset, etc. followed by TOCs. My question is how to know the end of the file? Are there any signature that I can use? Or would have to analyze the whole file? Why are you trying to reinvent the wheel? Is pg_restore not available? -- Michael
Re: [HACKERS] Support for N synchronous standby servers - take 2
On Tue, Jul 14, 2015 at 9:00 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Jul 13, 2015 at 10:34 PM, Fujii Masao wrote: On Fri, Jul 10, 2015 at 10:06 PM, Beena Emerson wrote: On Tue, Jul 7, 2015 at 2:19 PM, Michael Paquier wrote: Something like pg_syncinfo/ coupled with a LW lock, we already do something similar for replication slots with pg_replslot/. I was trying to figure out how the JSON metadata can be used. It would have to be set using a given set of functions. So we can use only such a set of functions to configure synch rep? I don't like that idea. Because it prevents us from configuring that while the server is not running. If you store a json blob in a set of files of PGDATA you could update them manually there as well. That's perhaps re-inventing the wheel with what is available with GUCs though. Why don't we just use a GUC? If the quorum setting is not so complicated in real scenarios, a GUC seems enough for that. Regards, -- Fujii Masao
Re: [HACKERS] optimizing vacuum truncation scans
On Mon, Jul 13, 2015 at 12:06 PM, Haribabu Kommi kommi.harib...@gmail.com wrote: On Thu, Jul 9, 2015 at 5:36 PM, Haribabu Kommi kommi.harib...@gmail.com wrote: I will do some performance tests and send you the results. Here are the performance results tested on my machine.

                 Head       vm patch   vm+prefetch patch
First vacuum     120 sec    1 sec      1 sec
Second vacuum    180 sec    180 sec    30 sec

I did some modifications in the code to skip the vacuum truncation in the first vacuum command; this way I collected the time taken by the second vacuum. I just combined your vm and prefetch patches into a single vm+prefetch patch without a GUC. I kept the prefetch size as 32 and did the performance test. I chose that prefetch size based on the current buffer access strategy, which is presently 32 for vacuum, instead of making it a user option. Here I attached the modified patch with both the vm and prefetch logic. I will do some tests on a machine with SSD and let you know the results. Based on these results, we can decide whether we need a GUC or not, depending on the impact of prefetch on SSD machines. Following are the performance readings on a machine with SSD. I increased the pgbench scale factor to 1000 in the test, instead of 500, to show better performance numbers.

                 Head       vm patch   vm+prefetch patch
First vacuum     6.24 sec   2.91 sec   2.91 sec
Second vacuum    6.66 sec   6.66 sec   7.19 sec

There is a small performance impact on SSD with prefetch. Regards, Hari Babu Fujitsu Australia
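The prefetch idea being measured above can be modeled in a few lines. This is an illustrative Python sketch, not the patch: `blocks` holds per-block "non-empty" flags, `PREFETCH_SIZE` mirrors the 32-block figure mentioned, and `prefetch` stands in for an asynchronous read hint such as posix_fadvise(POSIX_FADV_WILLNEED):

```python
PREFETCH_SIZE = 32  # mirrors the vacuum buffer access strategy ring size

def count_truncatable(blocks, prefetch):
    """Toy model of the truncation scan: walk backward from the end of the
    relation counting empty blocks, requesting the next batch ahead of time
    so reads overlap I/O.  `prefetch` just records what would be hinted."""
    truncatable = 0
    for i in range(len(blocks) - 1, -1, -1):
        # Issue a hint for the upcoming batch at each batch boundary.
        if i == len(blocks) - 1 or i % PREFETCH_SIZE == PREFETCH_SIZE - 1:
            prefetch(range(max(0, i - PREFETCH_SIZE + 1), i + 1))
        if blocks[i]:          # first non-empty block stops truncation
            break
        truncatable += 1
    return truncatable

requested = []
blocks = [True] * 60 + [False] * 40   # last 40 blocks are empty
assert count_truncatable(blocks, requested.append) == 40
```

Without the hint, each backward read is a cache miss the kernel cannot predict (its readahead heuristics expect forward scans), which is consistent with the second-vacuum improvement reported on the rotating-disk machine and the negligible effect on SSD.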
[HACKERS] intarray planning/exeuction problem with multiple operators
I've found an interesting performance problem in the intarray extension module. It doesn't seem to be version dependent; I've verified it in 9.4.4 and 9.6devel. If I do a query that involves both an = op and a @@ op ANDed together, it gives a high cost estimate, which is justified by the slow runtime. If I omit the @@ it gives a low estimate, also justified. If I trick it into thinking it cannot use the index to satisfy the @@ op, then it gives a low estimate and low runtime, applying the @@ in the filter step and only the fast = in the bitmap index scan. Now it could use the index for the fast = operation and apply the @@ in the recheck, rather than the filter. But I guess it doesn't think of that, despite knowing that applying the @@ in an index operation is slow. So it seems to be missing a trick someplace, but I don't know if it is reasonable to expect it to find that trick, or how easy/hard it would be to implement. The proposed selectivity estimate improvement patch for @@ does not fix this (and evaluating that patch was how I found this issue.) Set up:

create table foobar as
  select (select array_agg(floor(sqrt(random()*1000+g.y/100+f.x/1000))::int)
          from generate_series(1,10) g(y)) foo
  from generate_series(1,1000) f(x);
create index on foobar using gin(foo gin__int_ops);
analyze;

You can knock an order of magnitude off from the table size and should still be able to see the problem. 
example:

explain (analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo @@ '!1'::query_int;

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on foobar  (cost=0.00..263637.00 rows=1 width=61) (actual time=9717.957..9717.957 rows=0 loops=1)
   Filter: ((foo @@ '!1'::query_int) AND (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[]))
   Rows Removed by Filter: 1000
   Buffers: shared hit=64 read=113573
   I/O Timings: read=361.402
 Planning time: 0.101 ms
 Execution time: 9717.991 ms
(7 rows)

explain (analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}';

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foobar  (cost=112.01..116.02 rows=1 width=61) (actual time=0.027..0.027 rows=0 loops=1)
   Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
   Buffers: shared hit=21
   ->  Bitmap Index Scan on foobar_foo_idx  (cost=0.00..112.01 rows=1 width=0) (actual time=0.023..0.023 rows=0 loops=1)
         Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
         Buffers: shared hit=21
 Planning time: 0.097 ms
 Execution time: 0.061 ms

If I trick it into thinking the @@ operator cannot be used in the index, then it gets really fast again:

explain (analyze, buffers) select * from foobar where foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo||'{}' @@ '!1'::query_int;

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foobar  (cost=112.01..116.03 rows=1 width=61) (actual time=0.082..0.082 rows=0 loops=1)
   Recheck Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
   Filter: ((foo || '{}'::integer[]) @@ '!1'::query_int)
   Buffers: shared hit=21
   ->  Bitmap Index Scan on foobar_foo_idx  (cost=0.00..112.01 rows=1 width=0) (actual time=0.080..0.080 rows=0 loops=1)
         Index Cond: (foo = '{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
         Buffers: shared hit=21
 Planning time: 0.139 ms
 Execution time: 0.129 ms

Cheers,

Jeff
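The missed trick above can be caricatured with a toy cost model. The function and all the numbers below are illustrative only, not PostgreSQL's actual costing: total cost is modeled as a base cost plus a per-row qual cost times the number of rows the qual is actually evaluated on.

```c
/* Toy cost model (not PostgreSQL's): evaluating an expensive qual on
 * every row of a seqscan filter versus only on the rows surviving a
 * cheap index condition (i.e. as a bitmap recheck). */
double
plan_cost(double rows_qual_applied_to, double per_row_qual_cost,
		  double base_cost)
{
	return base_cost + rows_qual_applied_to * per_row_qual_cost;
}
```

With an expensive @@ evaluated on every row (seqscan filter) versus only on the roughly one row matching the = index condition (bitmap recheck), the second plan wins by orders of magnitude; that is the plan shape the planner apparently never considers here.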
Re: [HACKERS] [PATCH] Generalized JSON output functions
On Sun, Jul 12, 2015 at 8:39 PM, Pavel Stehule pavel.steh...@gmail.com wrote:

The thing is - it's not only about whitespace, otherwise I would probably not bother with the generic interface. For my original problem, there is simply no way to do this correctly in an extension without copying over all of the logic from json.c, which I have to do right now but would rather not.

I am sorry - we are talking about JSON, not about any styled document. I disagree that it cannot be implemented as an extension - the backport of JSON support is an extension.

Hm... I'm having a hard time making sense of that statement, sorry. To reiterate: for my problem, that is, escaping numerics that can potentially overflow[1] under the ECMAScript standard, I want to be able to override the code that outputs the numeric converted to a string. There is no way in the current implementation to do that *at all*, short of copying all the code involved in producing JSON output and changing it at certain points. One could try re-parsing the JSON instead, but that doesn't actually solve the issue, because the type information is lost forever at that point. The whitespace unification was a mere side-effect of the original effort on this patch.

--
Best regards,
Alex

[1] http://stackoverflow.com/questions/307179/what-is-javascripts-highest-integer-value-that-a-number-can-go-to-without-losin
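To make the overflow concrete (a minimal standalone sketch, not PostgreSQL code): ECMAScript represents all numbers as IEEE-754 doubles, so integers above 2^53 silently lose precision the moment a JavaScript client parses them, which is why emitting large numerics as strings matters.

```c
#include <stdint.h>

/* Model of what happens when a JavaScript client parses a large JSON
 * integer: the value passes through an IEEE-754 double, and values
 * above 2^53 can no longer be represented exactly. */
int64_t
roundtrip_via_double(int64_t v)
{
	double		d = (double) v;	/* what JSON.parse() effectively does */

	return (int64_t) d;
}
```

Below 2^53 the round trip is lossless; just above it, the parsed value silently differs from what the server emitted.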
Re: [HACKERS] [PATCH] Generalized JSON output functions
2015-07-13 9:30 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de:

On Sun, Jul 12, 2015 at 8:39 PM, Pavel Stehule pavel.steh...@gmail.com wrote:

The thing is - it's not only about whitespace, otherwise I would probably not bother with the generic interface. For my original problem, there is simply no way to do this correctly in an extension without copying over all of the logic from json.c, which I have to do right now but would rather not.

I am sorry - we are talking about JSON, not about any styled document. I disagree that it cannot be implemented as an extension - the backport of JSON support is an extension.

Hm... I'm having a hard time making sense of that statement, sorry. To reiterate: for my problem, that is, escaping numerics that can potentially overflow[1] under the ECMAScript standard, I want to be able to override the code that outputs the numeric converted to a string. There is no way in the current implementation to do that *at all*, short of copying all the code involved in producing JSON output and changing it at certain points. One could try re-parsing the JSON instead, but that doesn't actually solve the issue, because the type information is lost forever at that point. The whitespace unification was a mere side-effect of the original effort on this patch.

A dynamic type change is something I would not want to do in the database, really :) If you are afraid of overflow, then convert the numeric to a string immediately - in that case the client has to support both variants anyway, so an immediate cast should not be a problem. In any case, this check on the max number should probably be implemented in our JSON(b) output functions (as a warning?).

Regards
Pavel

--
Best regards,
Alex

[1] http://stackoverflow.com/questions/307179/what-is-javascripts-highest-integer-value-that-a-number-can-go-to-without-losin
[HACKERS] [PATCH] SQL function to report log message
Hi All,

Greetings for the day. I would like to discuss the below feature here.

Feature: an SQL function to write messages to the log destination.

Justification: As of now, we don't have an SQL function to write custom/application messages to the log destination. We have the RAISE clause, which is controlled by the log_ parameters. An SQL function which works irrespective of log settings would be good for many log parsers. What I mean is, from a DBA point of view, if we route all our native OS stats to log files in a proper format, then we can have our log reporting tools produce the most effective reports. Also, applications can log their own messages to the postgres log files, which can be monitored by DBAs too.

Implementation: Implemented a new function pg_report_log which takes one argument as text, and returns void. I took the LOG prefix for all the reporting messages. I wasn't sure whether to go with a new prefix for this, since these are normal LOG messages. Let me know if I am wrong here.

Here is the attached patch.

Regards,
Dinesh
manojadinesh.blogspot.com

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 76f77cb..b2fc4cd 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17850,6 +17850,15 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
        Return information about a file.
       </entry>
      </row>
+     <row>
+      <entry>
+       <literal><function>pg_report_log(<parameter>message</> <type>text</type>])</function></literal>
+      </entry>
+      <entry><type>void</type></entry>
+      <entry>
+       Write message into log file.
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
@@ -17918,6 +17927,18 @@ SELECT (pg_stat_file('filename')).modification;
 </programlisting>
    </para>

+   <indexterm>
+    <primary>pg_report_log</primary>
+   </indexterm>
+   <para>
+    <function>pg_report_log</> is useful to write custom messages
+    into current log destination and returns <type>void</type>.
+    Typical usages include:
+<programlisting>
+SELECT pg_report_log('Message');
+</programlisting>
+   </para>
+
  </sect2>

  <sect2 id="functions-advisory-locks">
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index c0495d9..6c54f3a 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -76,6 +76,23 @@ current_query(PG_FUNCTION_ARGS)
 }
 
 /*
+ * pg_report_log()
+ *
+ * Printing custom log messages in log file.
+ */
+
+Datum
+pg_report_log(PG_FUNCTION_ARGS)
+{
+
+	ereport(MESSAGE,
+			(errmsg("%s", text_to_cstring(PG_GETARG_TEXT_P(0))),
+			 errhidestmt(true)));
+
+	PG_RETURN_VOID();
+}
+
+/*
  * Send a signal to another backend.
  *
  * The signal is delivered if the user is either a superuser or the same
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 088c714..2e8f547 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -302,7 +302,7 @@ errstart(int elevel, const char *filename, int lineno,
 				 elevel == INFO);
 	}
 
-	/* Skip processing effort if non-error message will not be output */
+	/* Skip processing effort if non-error, custom message will not be output */
 	if (elevel < ERROR && !output_to_server && !output_to_client)
 		return false;
@@ -2062,6 +2062,7 @@ write_eventlog(int level, const char *line, int len)
 		case DEBUG3:
 		case DEBUG2:
 		case DEBUG1:
+		case MESSAGE:
 		case LOG:
 		case COMMERROR:
 		case INFO:
@@ -2917,6 +2918,7 @@ send_message_to_server_log(ErrorData *edata)
 			case DEBUG1:
 				syslog_level = LOG_DEBUG;
 				break;
+			case MESSAGE:
 			case LOG:
 			case COMMERROR:
 			case INFO:
@@ -3547,6 +3549,7 @@ error_severity(int elevel)
 		case DEBUG5:
 			prefix = _("DEBUG");
 			break;
+		case MESSAGE:
 		case LOG:
 		case COMMERROR:
 			prefix = _("LOG");
@@ -3666,6 +3669,9 @@ is_log_level_output(int elevel, int log_min_level)
 	/* Neither is LOG */
 	else if (elevel >= log_min_level)
 		return true;
+	/* If elevel is MESSAGE, then ignore log settings */
+	else if (elevel == MESSAGE)
+		return true;
 	return false;
 }
diff --git
a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 6fd1278..62c619a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -5344,6 +5344,11 @@ DESCR("tsm_bernoulli_reset(internal)");
 DATA(insert OID = 3346 (  tsm_bernoulli_cost	PGNSP PGUID 12 1 0 0 0 f f f f t f v 7 0 2278 "2281 2281 2281 2281 2281 2281 2281
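The behavioral core of the patch is the change to is_log_level_output(): the new MESSAGE level is emitted regardless of log_min_messages. A simplified standalone model of that logic follows; the enum values and the function body are illustrative, not elog.h/elog.c's actual definitions, which special-case LOG in the severity ordering.

```c
#include <stdbool.h>

/* Simplified severity ordering, loosely following elog.h.  MESSAGE is
 * the patch's new level; its numeric value here is arbitrary because
 * it bypasses the comparison entirely. */
enum
{
	DEBUG1 = 1,
	INFO,
	NOTICE,
	WARNING,
	ERROR_LEVEL,
	MESSAGE = 99
};

/* Should a message of severity "elevel" reach the server log, given
 * the configured minimum "log_min_level"? */
bool
is_log_level_output(int elevel, int log_min_level)
{
	if (elevel == MESSAGE)		/* the patch: ignore log settings */
		return true;
	return elevel >= log_min_level;
}
```

This captures why pg_report_log() works "irrespective of log settings": a MESSAGE-level report is logged even when log_min_messages is set as high as ERROR.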
Re: [HACKERS] [PATCH] SQL function to report log message
On Mon, Jul 13, 2015 at 4:54 PM, dinesh kumar dineshkuma...@gmail.com wrote: Would like to discuss the below feature here. Feature: an SQL function to write messages to the log destination. Justification: As of now, we don't have an SQL function to write custom/application messages to the log destination. We have the RAISE clause, which is controlled by the log_ parameters. An SQL function which works irrespective of log settings would be good for many log parsers. What I mean is, from a DBA point of view, if we route all our native OS stats to log files in a proper format, then we can have our log reporting tools produce the most effective reports. Also, applications can log their own messages to the postgres log files, which can be monitored by DBAs too.

What's the actual use case for this feature other than that it would be good to have? A log message is there to give information about the state of something that happens in the backend, but in the case of this function the event that happens is the content of the function itself. It also adds a new log level for something that has a unique usage, which looks like overkill to me. Btw, you could do something more advanced simply with an extension if you really want to play in this area... But I am dubious about what kind of applications would use that.
--
Michael