Re: [HACKERS] Rowcounts marked by create_foreignscan_path()
(2014/02/18 12:37), Tom Lane wrote:
> Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
>> (2014/02/18 12:03), Tom Lane wrote:
>>> The calling FDW is supposed to do that; note the header comment.
>> However, ISTM postgresGetForeignPaths() doesn't work like that.
> It uses the same rowcount for all paths of the same parameterization?
> That's what we want, no? Maybe I'm missing something.

But I don't think postgresGetForeignPaths() marks the parameterized path
with the correct row estimate. Also, that function doesn't seem to
estimate the cost of the parameterized path correctly. Please find
attached a patch.

> Anyway, the point of using ppi_rows would be to enforce that all the
> rowcount estimates for a given parameterized relation are the same. In
> the FDW case, all those estimates are the FDW's responsibility, and so
> making them consistent is also its responsibility IMO. Another way of
> looking at this is that none of the pathnode creation routines in
> pathnode.c are responsible for setting rowcount estimates. That's done
> by whatever is setting the cost estimate; this must be so, else the
> cost estimate is surely bogus. So any way you slice it, the FDW has to
> get it right.

Understood. Thank you for the analysis. Sorry for the delay.

Best regards,
Etsuro Fujita

*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 295,300 **** static bool postgresAnalyzeForeignTable(Relation relation,
--- 295,301 ----
  static void estimate_path_cost_size(PlannerInfo *root,
  						RelOptInfo *baserel,
  						List *join_conds,
+ 						List *param_conds,
  						double *p_rows, int *p_width,
  						Cost *p_startup_cost, Cost *p_total_cost);
  static void get_remote_estimate(const char *sql,
***************
*** 493,499 **** postgresGetForeignRelSize(PlannerInfo *root,
  	 * values in fpinfo so we don't need to do it again to generate the
  	 * basic foreign path.
  	 */
! 	estimate_path_cost_size(root, baserel, NIL,
  							&fpinfo->rows, &fpinfo->width,
  							&fpinfo->startup_cost, &fpinfo->total_cost);
--- 494,500 ----
  	 * values in fpinfo so we don't need to do it again to generate the
  	 * basic foreign path.
  	 */
! 	estimate_path_cost_size(root, baserel, NIL, NIL,
  							&fpinfo->rows, &fpinfo->width,
  							&fpinfo->startup_cost, &fpinfo->total_cost);
***************
*** 523,529 **** postgresGetForeignRelSize(PlannerInfo *root,
  	set_baserel_size_estimates(root, baserel);

  	/* Fill in basically-bogus cost estimates for use later. */
! 	estimate_path_cost_size(root, baserel, NIL,
  							&fpinfo->rows, &fpinfo->width,
  							&fpinfo->startup_cost, &fpinfo->total_cost);
  }
--- 524,530 ----
  	set_baserel_size_estimates(root, baserel);

  	/* Fill in basically-bogus cost estimates for use later. */
! 	estimate_path_cost_size(root, baserel, NIL, NIL,
  							&fpinfo->rows, &fpinfo->width,
  							&fpinfo->startup_cost, &fpinfo->total_cost);
  }
***************
*** 541,547 **** postgresGetForeignPaths(PlannerInfo *root,
--- 542,550 ----
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
  	ForeignPath *path;
  	List	   *join_quals;
+ 	List	   *param_quals;
  	Relids		required_outer;
+ 	ParamPathInfo *param_info;
  	double		rows;
  	int			width;
  	Cost		startup_cost;
***************
*** 591,604 **** postgresGetForeignPaths(PlannerInfo *root,
  		if (!is_foreign_expr(root, baserel, rinfo->clause))
  			continue;

- 		/*
- 		 * OK, get a cost estimate from the remote, and make a path.
- 		 */
- 		join_quals = list_make1(rinfo);
- 		estimate_path_cost_size(root, baserel, join_quals,
- 								&rows, &width,
- 								&startup_cost, &total_cost);
-
  		/* Must calculate required outer rels for this path */
  		required_outer = bms_union(rinfo->clause_relids,
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
Hi,

Attached is a patch with the updated documentation (now consistently uses "huge pages"), a renamed GUC, consistent wording (always "huge pages"), and renamed variables.

> Hmm, I wonder if that could now be misunderstood to have something to do
> with the PostgreSQL page size? Maybe add the word "memory" or "operating
> system" in the first sentence in the docs, like this: "Enables/disables
> the use of huge memory pages."

Accepted, see attached patch.

>     <para>
>      At present, this feature is supported only on Linux. The setting is
>      ignored on other systems when set to <literal>try</literal>.
>      <productname>PostgreSQL</productname> will refuse to start when set
>      to <literal>on</literal>.
>     </para>
>
> Is it clear enough that PostgreSQL will only refuse to start up when it's
> set to "on", *if the feature's not supported on the platform*? Perhaps
> just leave that last sentence out. It's mentioned later that "With
> <literal>on</literal>, failure to use huge pages will prevent the server
> from starting up.", that's probably enough.

Fixed.

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fa9ee37..065c1db 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1166,14 +1166,14 @@ include 'filename'
      </listitem>
     </varlistentry>

-     <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
-      <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+     <varlistentry id="guc-huge-pages" xreflabel="huge_pages">
+      <term><varname>huge_pages</varname> (<type>enum</type>)</term>
       <indexterm>
-       <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+       <primary><varname>huge_pages</> configuration parameter</primary>
       </indexterm>
       <listitem>
        <para>
-        Enables/disables the use of huge TLB pages. Valid values are
+        Enables/disables the use of huge memory pages. Valid values are
         <literal>try</literal> (the default), <literal>on</literal>,
         and <literal>off</literal>.
        </para>
@@ -1181,19 +1181,16 @@ include 'filename'
        <para>
         At present, this feature is supported only on Linux. The setting is
         ignored on other systems when set to <literal>try</literal>.
-        <productname>PostgreSQL</productname> will
-        refuse to start when set to <literal>on</literal>.
        </para>

        <para>
-        The use of huge TLB pages results in smaller page tables and
-        less CPU time spent on memory management, increasing performance. For
-        more details, see
-        <xref linkend="linux-huge-tlb-pages">.
+        The use of huge pages results in smaller page tables and less CPU time
+        spent on memory management, increasing performance. For more details,
+        see <xref linkend="linux-huge-pages">.
        </para>

        <para>
-        With <varname>huge_tlb_pages</varname> set to <literal>try</literal>,
+        With <varname>huge_pages</varname> set to <literal>try</literal>,
         the server will try to use huge pages, but fall back to using normal
         allocation if that fails. With <literal>on</literal>, failure to use
         huge pages will prevent the server from starting up. With
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 5f9fa61..7f4a235 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1308,13 +1308,13 @@ echo -1000 > /proc/self/oom_score_adj
   </note>
  </sect2>

-  <sect2 id="linux-huge-tlb-pages">
-   <title>Linux huge TLB pages</title>
+  <sect2 id="linux-huge-pages">
+   <title>Linux huge pages</title>

    <para>
-    Using huge TLB pages reduces overhead when using large contiguous chunks
-    of memory, like PostgreSQL does. To enable this feature
-    in <productname>PostgreSQL</productname> you need a kernel
+    Using huge pages reduces overhead when using large contiguous chunks of
+    memory, like <productname>PostgreSQL</productname> does. To enable this
+    feature in <productname>PostgreSQL</productname> you need a kernel
     with <varname>CONFIG_HUGETLBFS=y</varname> and
     <varname>CONFIG_HUGETLB_PAGE=y</varname>. You also have to tune the
     system setting <varname>vm.nr_hugepages</varname>. To estimate the
     number of
@@ -1345,7 +1345,7 @@ $ <userinput>sysctl -w vm.nr_hugepages=3170</userinput>
     <productname>PostgreSQL</productname> is to use them when possible and
     to fallback to normal pages when failing. To enforce the use of huge
     pages, you can set
-    <link linkend="guc-huge-tlb-pages"><varname>huge_tlb_pages</varname></link>
+    <link linkend="guc-huge-pages"><varname>huge_pages</varname></link>
     to <literal>on</literal>. Note that in this case
     <productname>PostgreSQL</productname> will fail to start if not enough
     huge pages are available.
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 65ad595..51c1a2b 100644
--- a/src/backend/port/sysv_shmem.c
+++
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
Hi,

On 28/02/14 17:58, Peter Geoghegan wrote:
> On Fri, Feb 28, 2014 at 9:43 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote:
>> Hmm, I wonder if that could now be misunderstood to have something to do
>> with the PostgreSQL page size? Maybe add the word "memory" or "operating
>> system" in the first sentence in the docs, like this: "Enables/disables
>> the use of huge memory pages."
>
> Whenever I wish to emphasize that distinction, I tend to use the term
> "MMU pages".

I don't like to deviate that much from the Linux terminology; that may lead to confusion. And using the term in only one place doesn't seem to make sense either -- the naming would then be inconsistent and thus also lead to confusion. Do you agree?

Best regards,

--
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] UNION ALL on partitioned tables won't use indices.
Hello,

>> Yes, the old dumped version of the type-2 patch did so. It flattened
>> the appendrel tree for the query properly. Let me hear the reason you
>> prefer to do so.
> Having reviewed my upthread reasoning for preferring one of those two
> approaches over the other, it's a weak preference. They have similar
> runtime costs. Putting the logic with the code that creates appendrels
> reduces the number of code sites one must examine to reason about
> possible plan structures. We might not flatten RTE_RELATION appendrel
> parents exactly the same way we flatten RTE_SUBQUERY appendrel parents.
> I would tend to leave inh=true for the former, for reasons explained in
> my notes on v7, but set inh=false for the latter to save needless work.

Unfortunately, RTE_SUBQUERYs with their inh flag cleared might cause something inconvenient in preprocess_minmax_aggregates() and/or add_rtes_to_flat_rtable()..

# I haven't found anything related to sepgsql, though :-(

I understand that your concern is treating parent RTEs differently according to their relkinds. But I think the inh flags cannot be modified at all, for the reason mentioned above. Finally, the seemingly safest way to exclude removed intermediate RTEs from subsequent processing is simply to remove the AppendRelInfos directly linked to them (as done in v7), for both kinds of parent, RTE_RELATION and RTE_SUBQUERY. The difference has disappeared in this story.

At least as of now, inheritance tables define the bottom bound of an appendrel tree while complex(?) UNION ALLs define the upper bound, and both multilevel (simple) UNION ALLs and inheritances are flattened individually, so the only possible inheritance tree to be collapsed by this patch is, I think:

  Subquery(inh=1) [Relation-inhparent [Relation-child, child, child]]

> On the other hand, a flattening pass is less code overall and brings an
> attractive uniformity-by-default to the area.

Focusing only on the above structure, all we should do to collapse this tree is connect the children to the Subquery directly, then remove all AppendRelInfos connecting to the Relation-inhparent. The inh flag need not be modified.

# Umm. I strongly suspect that I overlooked something..

Then even widening to the general case, things don't seem to change. All we need to do is link a child to its grandparent and isolate the parent by removing its AppendRelInfos.

Thoughts?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Get more from indices.
I marked this patch as 'Ready for Committer' by myself according to the following discussion. Thanks.

At Tue, 25 Feb 2014 16:35:55 -0500, Robert Haas wrote:
> On Tue, Feb 25, 2014 at 3:36 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:
>> May I mark this patch as Ready for Committer by myself, since it was
>> already marked so in the last CF3 and has had no objection or new
>> suggestion in this CF4 for more than a month?
> Sounds fair.

Thank you, I will send this patch to committers after waiting another few days for more suggestions.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On 2014-02-28 20:55:20 +0530, Amit Kapila wrote:
> On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse wrote:
>>> I wouldn't be inclined to dump the whole tuple under any circumstances.
>>> That could be a lot more data than what you want dumped in your log.
>>> The PK could already be somewhat unreasonably large, but the whole
>>> tuple could be a lot more unreasonably large.
>> Well, as I already stated: we don't. I copied the behavior we use in
>> CHECK constraints (ExecBuildSlotValueDescription()).
> I think now more people are of the opinion that displaying the whole
> tuple is not useful. I believe it is good to go ahead by displaying just
> the primary key for this case and move ahead.

The patch does not, and has never, printed the full tuple. What it does is copy the behaviour of printing the tuple for check constraint violations, which is truncating the tuple after a certain length. I don't think changing this in an isolated manner to the primary key would be a good idea. Let's use the same logic here, and if somebody wants to argue for always using the primary key instead of a truncated tuple, let them submit a separate patch.

Greetings,

Andres Freund

--
 Andres Freund	               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On 2014-03-01 13:29:18 +0530, Amit Kapila wrote:
> With the new patch, the message while updating locked rows will be
> displayed as below:
>
> LOG: process 364 still waiting for ShareLock on transaction 678 after 1014.000 ms
> CONTEXT: while attempting to lock tuple (0,2) with values (2) in relation public.t1 of database postgres
> LOG: process 364 acquired ShareLock on transaction 678 after 60036.000 ms
> CONTEXT: while attempting to lock tuple (0,2) with values (2) in relation public.t1 of database postgres
>
> Now I am not sure if the second message is an improvement, as what it
> sounds like is 'while attempting to lock tuple, it got a shared lock on
> the transaction'. If you, Robert or others feel it is okay, then we can
> retain it as it is in the patch; else I think we either need to rephrase
> it or maybe try some other way (global variable) such that it appears
> only for the required case. I feel the way Robert suggested, i.e. to
> make it the Detail of the particular message (we might need to use a
> global variable to pass certain info), is the better way and will have
> minimal impact on the cases where this additional information needs to
> be displayed.

I really don't understand the origins of your arguments here. Why shouldn't a row lock caused by an UPDATE be relevant? It's currently very hard to debug those, just as it's hard to debug tuple/row locks caused by an explicit FOR SHARE/UPDATE/... And if your argument is that the message might be displayed when an explicit query cancel (statement timeout) instead of a deadlock_timeout arrives, so what? It's still correct and important information? After all, it seems to have waited long enough to get cancelled.

Greetings,

Andres Freund

--
 Andres Freund	               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] GSoC proposal
> I'm applying for GSoC 2014 with PostgreSQL and would appreciate your
> comments on my proposal (attached). I'm looking for technical
> corrections/comments and your opinions on the project's viability. In
> particular, if the community has doubts about its usefulness, I would
> start working on an extra proposal from
> https://wiki.postgresql.org/wiki/GSoC_2014, perhaps on the RETURNING
> clause as a student named Karlik did last year.

I am sure that Simon had his reasons when he proposed http://www.postgresql.org/message-id/CA+U5nMJGgJNt5VXqkR=crtdqxfmuyzwef23-fd5nusns+6n...@mail.gmail.com but I cannot help asking some questions:

1) Why limit the feature to UTF8 strings? Shouldn't the technique work for all multibyte server encodings?

2) There is probably something that makes this necessary, but why should the decision how TOAST is sliced be attached to the data type? My (probably naive) idea would be to add a new TOAST strategy (e.g. SLICED) to PLAIN, MAIN, EXTERNAL and EXTENDED.

The feature only makes sense for string data types, right?

Yours,
Laurenz Albe
Re: [HACKERS] gaussian distribution pgbench
(2014/03/03 16:51), Fabien COELHO wrote:
>>> \setrandom foo 1 10 [uniform]
>>> \setrandom foo 1 :size gaussian 3.6
>>> \setrandom foo 1 100 exponential 7.2
>> It's a good design. I think it will have lower overhead in the parsing
>> part of pgbench, because string comparisons will be reduced (maybe).
>> And I'd like to remove "[uniform]", because we have to keep
>> compatibility for old scripts, and a random function always gets a
>> uniform distribution in the common sense of programming.
> I just put uniform as an optional default, hence the brackets.

All right. I was misunderstanding. However, if we select this format, I'd like to remove it, because pgbench needs to check the number of arguments; if we allow brackets, it will not be simple.

> Otherwise, what I would have in mind if this were designed from scratch:
>
>   \set foo 124
>   \set foo "string value" (?)
>   \set foo :variable
>   \set foo 12 + :shift
>
> And then
>
>   \set foo uniform 1 10
>   \set foo gaussian 1 10 4.2
>   \set foo exponential 1 100 5.2
>
> or maybe functions could be prepended with something like "uniform".
> But that would be for another life:-)

I don't agree with that.. It has more overhead in the parsing part and is more complex for the user.

>> However, the new grammar is a little bit long in user scripts. It seems
>> a trade-off between script readability and user writing cost.
> Yep.

OK. I'm not sure which idea is the best, so I wait for comments from the community:)

Regards,

--
Mitsumasa KONDO
NTT Open Source Software Center
Re: [HACKERS] Review: Patch FORCE_NULL option for copy COPY in CSV mode
On Mon, Mar 3, 2014 at 1:13 PM, Andrew Dunstan and...@dunslane.net wrote:
> That difference actually made the file_fdw regression results plain
> wrong, in my view, in that they expected a quoted empty string to be
> turned to null even when the null string was something else. I've
> adjusted this and the docs and propose to apply the attached patch in
> the next day or two unless there are any objections.

Unless I'm overlooking something, the output of "SELECT * FROM text_csv;" in 'output/file_fdw.source' still needs updating?

While reading this patch, I found a small typo in copy2.[sql|out]:
s/violiaton/violation/g

Regards,

--
Michael
Re: [HACKERS] heapgetpage() and ->takenDuringRecovery
On Sun, Mar 2, 2014 at 8:39 AM, Andres Freund and...@2ndquadrant.com wrote:
> While reading around to see which references to SnapshotData's members
> exist, I once more came across the following tidbit in heapgetpage():
>
> 	/*
> 	 * If the all-visible flag indicates that all tuples on the page are
> 	 * visible to everyone, we can skip the per-tuple visibility tests.
> 	 *
> 	 * Note: In hot standby, a tuple that's already visible to all
> 	 * transactions in the master might still be invisible to a read-only
> 	 * transaction in the standby. We partly handle this problem by tracking
> 	 * the minimum xmin of visible tuples as the cut-off XID while marking a
> 	 * page all-visible on master and WAL log that along with the visibility
> 	 * map SET operation. In hot standby, we wait for (or abort) all
> 	 * transactions that can potentially may not see one or more tuples on the
> 	 * page. That's how index-only scans work fine in hot standby. A crucial
> 	 * difference between index-only scans and heap scans is that the
> 	 * index-only scan completely relies on the visibility map where as heap
> 	 * scan looks at the page-level PD_ALL_VISIBLE flag. We are not sure if
> 	 * the page-level flag can be trusted in the same way, because it might
> 	 * get propagated somehow without being explicitly WAL-logged, e.g. via a
> 	 * full page write. Until we can prove that beyond doubt, let's check each
> 	 * tuple for visibility the hard way.
> 	 */
> 	all_visible = PageIsAllVisible(dp) && !snapshot->takenDuringRecovery;
>
> I don't think this is necessary >= 9.2. There are only two interesting
> places where PD_ALL_VISIBLE is set:
>
> a) lazy_vacuum_page(), where an xl_heap_clean record is logged *before*
> PD_ALL_VISIBLE/the vm is touched, and that causes recovery conflicts.
> The heap page is locked for cleanup at that point. As the logging of
> xl_heap_clean sets the page's LSN, there's no way the page can appear on
> the standby too early.
>
> b) empty pages in lazy_scan_heap(). If they always were empty, there's
> no need for conflicts.
>
> The only other way I can see to end up there is a previous
> heap_page_prune() that repaired fragmentation. But that logs a WAL
> record with conflict information.

I don't think there's any reason to believe that lazy_scan_heap() can only hit pages that are empty or have just been defragged. Suppose that there's a tuple on the page which was recently inserted; the inserting transaction has committed, but there are some backends that still have older snapshots. The page won't be marked all-visible, because it isn't. Now, eventually those older snapshots will go away, and sometime after that the relation will get vacuumed again, and we'll once again look at the page. But this time we notice that it is all-visible, and mark it so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Defining macro LSNOID for pg_lsn in pg_type.h
On Thu, Feb 27, 2014 at 8:12 AM, Michael Paquier michael.paqu...@gmail.com wrote: When working on the datatype pg_lsn, we actually did not create a define macro for its oid in pg_type.h and this could be useful for extension developers. The simple patch attached corrects that by naming this macro LSNOID. Thanks, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] heapgetpage() and ->takenDuringRecovery
On 2014-03-03 06:57:00 -0500, Robert Haas wrote:
> On Sun, Mar 2, 2014 at 8:39 AM, Andres Freund and...@2ndquadrant.com wrote:
>> While reading around to see which references to SnapshotData's members
>> exist, I once more came across the following tidbit in heapgetpage():
>>
>> 	/*
>> 	 * If the all-visible flag indicates that all tuples on the page are
>> 	 * visible to everyone, we can skip the per-tuple visibility tests.
>> 	 *
>> 	 * Note: In hot standby, a tuple that's already visible to all
>> 	 * transactions in the master might still be invisible to a read-only
>> 	 * transaction in the standby. We partly handle this problem by tracking
>> 	 * the minimum xmin of visible tuples as the cut-off XID while marking a
>> 	 * page all-visible on master and WAL log that along with the visibility
>> 	 * map SET operation. In hot standby, we wait for (or abort) all
>> 	 * transactions that can potentially may not see one or more tuples on the
>> 	 * page. That's how index-only scans work fine in hot standby. A crucial
>> 	 * difference between index-only scans and heap scans is that the
>> 	 * index-only scan completely relies on the visibility map where as heap
>> 	 * scan looks at the page-level PD_ALL_VISIBLE flag. We are not sure if
>> 	 * the page-level flag can be trusted in the same way, because it might
>> 	 * get propagated somehow without being explicitly WAL-logged, e.g. via a
>> 	 * full page write. Until we can prove that beyond doubt, let's check each
>> 	 * tuple for visibility the hard way.
>> 	 */
>> 	all_visible = PageIsAllVisible(dp) && !snapshot->takenDuringRecovery;
>>
>> I don't think this is necessary >= 9.2. There are only two interesting
>> places where PD_ALL_VISIBLE is set:
>>
>> a) lazy_vacuum_page(), where an xl_heap_clean record is logged *before*
>> PD_ALL_VISIBLE/the vm is touched, and that causes recovery conflicts.
>> The heap page is locked for cleanup at that point. As the logging of
>> xl_heap_clean sets the page's LSN, there's no way the page can appear on
>> the standby too early.
>>
>> b) empty pages in lazy_scan_heap(). If they always were empty, there's
>> no need for conflicts.
>>
>> The only other way I can see to end up there is a previous
>> heap_page_prune() that repaired fragmentation. But that logs a WAL
>> record with conflict information.
>
> I don't think there's any reason to believe that lazy_scan_heap() can
> only hit pages that are empty or have just been defragged. Suppose that
> there's a tuple on the page which was recently inserted; the inserting
> transaction has committed, but there are some backends that still have
> older snapshots. The page won't be marked all-visible, because it isn't.
> Now, eventually those older snapshots will go away, and sometime after
> that the relation will get vacuumed again, and we'll once again look at
> the page. But this time we notice that it is all-visible, and mark it so.

Hm, right, that can happen. How about adding an LSN interlock in visibilitymap_set() for those cases as well, not just for checksums?

Greetings,

Andres Freund

--
 Andres Freund	               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Bump pageinspect to 1.2: page_header with pg_lsn datatype
On Thu, Feb 27, 2014 at 8:05 AM, Michael Paquier michael.paqu...@gmail.com wrote:
> On Wed, Feb 26, 2014 at 5:45 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> Robert Haas escribió:
>>> On Mon, Feb 24, 2014 at 9:34 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>>>> Yeah, erroring out seems good enough. Particularly if you add a hint
>>>> saying "please upgrade the extension".
>>> In past instances where this has come up, we have actually made the
>>> .so backward-compatible. See pg_stat_statements in particular. I'd
>>> prefer to keep to that precedent here.
>> But pg_stat_statements is a user tool which is expected to have lots of
>> use, and backwards compatibility concerns because of people who might
>> have written tools on top of it. Not so with pageinspect. I don't think
>> we need to put in the same amount of effort. (Even though, really, it's
>> probably not all that difficult to support both cases. I just don't see
>> the point.)
> Actually, with a little bit of hacking, I noticed that supporting both
> is as complicated as supporting only pg_lsn... Here is for example how
> I can get page_header to work across versions:
>
> -	snprintf(lsnchar, sizeof(lsnchar), "%X/%X",
> -			 (uint32) (lsn >> 32), (uint32) lsn);
> -	values[0] = CStringGetTextDatum(lsnchar);
> +	/*
> +	 * Do some version-related checks. pageinspect >= 1.2 uses pg_lsn
> +	 * instead of text when using this function for the LSN field.
> +	 */
> +	if (tupdesc->attrs[0]->atttypid == TEXTOID)
> +	{
> +		char		lsnchar[64];
> +
> +		snprintf(lsnchar, sizeof(lsnchar), "%X/%X",
> +				 (uint32) (lsn >> 32), (uint32) lsn);
> +		values[0] = CStringGetTextDatum(lsnchar);
> +	}
> +	else
> +		values[0] = LSNGetDatum(lsn);
>
> In this case an upgraded 9.4 server using pageinspect 1.1 or older
> simply goes through the text datatype path... You can find that in the
> patch attached.

Thanks, committed.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] heapgetpage() and ->takenDuringRecovery
On Mon, Mar 3, 2014 at 7:07 AM, Andres Freund and...@2ndquadrant.com wrote: I don't think there's any reason to believe that lazy_scan_heap() can only hit pages that are empty or have just been defragged. Suppose that there's a tuple on the page which was recently inserted; the inserting transaction has committed but there are some backends that still have older snapshots. The page won't be marked all-visible because it isn't. Now, eventually those older snapshots will go away, and sometime after that the relation will get vacuumed again, and we'll once again look the page. But this time we notice that it is all-visible, and mark it so. Hm, right, that can happen. How about adding a LSN interlock in visibilitymap_set() for those cases as well, not just for checksums? Well, if I'm correctly understanding what you're proposing, that would involve emitting an FPI for each page when we vacuum the newly-inserted portion of an insert-only table. That's been repeatedly proposed in the past, but I've opposed it on the grounds that it makes vacuum much more expensive in such cases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Defining macro LSNOID for pg_lsn in pg_type.h
On Mon, Mar 3, 2014 at 9:06 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Feb 27, 2014 at 8:12 AM, Michael Paquier michael.paqu...@gmail.com wrote: When working on the datatype pg_lsn, we actually did not create a define macro for its oid in pg_type.h and this could be useful for extension developers. The simple patch attached corrects that by naming this macro LSNOID. Thanks, committed. Thanks. -- Michael
Re: [HACKERS] VACUUM FULL/CLUSTER doesn't update pg_class's pg_class.relfrozenxid
On Thu, Feb 27, 2014 at 1:06 PM, Andres Freund and...@2ndquadrant.com wrote:
> As Robert previously complained, a database-wide VACUUM FULL now (as of
> 3cff1879f8d03) reliably increases the relfrozenxid for all tables but
> pg_class itself. That's a bit sad because it means doing a VACUUM FULL
> won't help in an anti-wraparound scenario. The reason for that is
> explained by the following comment:
>
> 	/*
> 	 * Update the tuples in pg_class --- unless the target relation of the
> 	 * swap is pg_class itself. In that case, there is zero point in making
> 	 * changes because we'd be updating the old data that we're about to throw
> 	 * away. Because the real work being done here for a mapped relation is
> 	 * just to change the relation map settings, it's all right to not update
> 	 * the pg_class rows in this case.
> 	 */
>
> I think the easiest fix for that is to update pg_class's relfrozenxid in
> finish_heap_swap() after the indexes have been rebuilt; that's just a
> couple of lines. There are more complex solutions that would avoid the
> need for that special case, but I think it's sufficient. A patch doing
> that is attached.

So, this patch is obviously after the final CommitFest deadline, but I'd like to commit it to 9.4 anyway, on the admittedly arguable theory that it's tying up a loose end introduced by 3cff1879f8d03cb729368722ca823a4bf74c0cac. Prior to that commit, VACUUM FULL and CLUSTER *never* updated relfrozenxid; beginning with that commit, they do so for all relations except pg_class. This tidies up that omission. And I think that's a pretty worthwhile thing to do, because we get periodic reports from people who have run VACUUM FULL on a database in danger of wraparound and then wondered why it did not fix the problem. The previously-mentioned commit did most of the heavy lifting as far as tidying that up, but without this adjustment it won't quite get us over the hump. But all that having been said, a deadline is a deadline, so if anyone wishes to declare this untimely, please speak up.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: VACUUM FULL/CLUSTER doesn't update pg_class's pg_class.relfrozenxid
On 2014-03-03 07:52:23 -0500, Robert Haas wrote: And I think that's a pretty worthwhile thing to do, because we get periodic reports from people who have run VACUUM FULL on a database in danger of wraparound and then wondered why it did not fix the problem. The previously-mentioned commit did most of the heavy lifting as far as tidying that up, but without this adjustment it won't quite get us over the hump. But all that having been said, a deadline is a deadline, so if anyone wishes to declare this untimely please speak up. Now, I am obviously not impartial here, but I think it doesn't make sense to whack this code around in two releases if not necessary. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Securing make check (CVE-2014-0067)
On 03/03/2014 02:00 AM, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: The only way I can see this being of real use to an attacker is if they could use this exploit to create a wormed version of PostgreSQL on the target build system. Is that possible? It's theoretically possible, since having broken into the build user's account they could modify the already-built-but-not-yet-packaged PG executables. Having said that, though, I concur with the feeling that this probably isn't a useful exploit in practice. On Red Hat's build systems, for example, different packages are built in different chroots. So even if a malicious package is being built concurrently, it could not reach the postmaster's socket. A breakin would only be possible for somebody who had outside-the-chroots control of the build machine ... in which case they can hack pretty much any built package pretty much any way they want, without need for anything as fiddly as this. Other vendors might do things differently, but it still seems likely that there would be easier exploits available to anyone who's managed to get control of a machine used for package building. I'm less worried about vendor build systems and more about "roll your own" systems like Gentoo, FreeBSD ports, and Homebrew. cheers andrew
Re: [HACKERS] heapgetpage() and ->takenDuringRecovery
On 2014-03-03 06:57:00 -0500, Robert Haas wrote: On Sun, Mar 2, 2014 at 8:39 AM, Andres Freund and...@2ndquadrant.com wrote: I don't think this is necessary >= 9.2. There are only two interesting places where PD_ALL_VISIBLE is set: a) lazy_vacuum_page(), where a xl_heap_clean is logged *before* PD_ALL_VISIBLE/the vm is touched, and that causes recovery conflicts. The heap page is locked for cleanup at that point. As the logging of xl_heap_clean sets the page's LSN, there's no way the page can appear on the standby too early. b) empty pages in lazy_scan_heap(). If they always were empty, there's no need for conflicts. The only other way I can see to end up there is a previous heap_page_prune() that repaired fragmentation. But that logs a WAL record with conflict information.

I don't think there's any reason to believe that lazy_scan_heap() can only hit pages that are empty or have just been defragged. Suppose that there's a tuple on the page which was recently inserted; the inserting transaction has committed but there are some backends that still have older snapshots. The page won't be marked all-visible because it isn't. Now, eventually those older snapshots will go away, and sometime after that the relation will get vacuumed again, and we'll once again look at the page. But this time we notice that it is all-visible, and mark it so.

Right now I am missing how this isn't an actual correctness problem after a crash. Without an LSN interlock we could crash *after* the heap page has been written out, but *before* the vm WAL record has been flushed to disk. Combined with synchronous_commit=off there could be transactions that appeared as safely committed to vacuum (i.e. are below GetOldestXmin()), but which are actually aborted after the crash. Normal hint bits circumvent that by checking XLogNeedsFlush(commitLSN), but that doesn't work here. Am I missing something?
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Triggers on foreign tables
I tried to check the latest (v8) patch again, and could not find any problem in your design change from v7. As Noah pointed out, it uses a per-query-depth tuplestore that is released on AfterTriggerEndSubXact. So, may I mark it as ready for committer? 2014-03-03 15:48 GMT+09:00 Ronan Dunklau ronan.dunk...@dalibo.com: Hello. Did you have time to review the latest version of this patch? Is there anything I can do to get this ready for committer? Thank you for all the work performed so far. On Tuesday, February 4, 2014 at 13:16:22, Ronan Dunklau wrote: On Monday, February 3, 2014 at 23:28:45, Noah Misch wrote: On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote: On Thursday, January 30, 2014 at 14:05:08, Noah Misch wrote: On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote: What do you think about this approach? Is there something I missed which would make it not sustainable? Seems basically reasonable. I foresee multiple advantages from having one tuplestore per query level as opposed to one for the entire transaction. You would remove the performance trap of backing up the tuplestore by rescanning. It permits reclaiming memory and disk space in AfterTriggerEndQuery() rather than at end of transaction. You could remove ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or the next two tuples from the tuplestore. Using work_mem per AfterTriggerBeginQuery() instead of per transaction is no problem. What do you think of that design change? I agree that this design is better, but I have some objections. We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't go away. Consider for example the case of a foreign table with more than one AFTER UPDATE trigger. Unless we store the tuples once for each trigger, we will have to rescan the tuplestore. Will we?
Within a given query level, when do (non-deferred) triggers execute in an order other than the enqueue order? Let me explain what I had in mind. Looking at the code in AfterTriggerSaveEvent: - we build a template AfterTriggerEvent, and store the tuple(s) - for each suitable after trigger that matches the trigger type, as well as the WHEN condition if any, a copy of the previously built AfterTriggerEvent is queued. Later, those events are fired in order. This means that more than one event can be fired for one tuple. Take this example:

CREATE TRIGGER trig_row_after1 AFTER UPDATE ON rem2 FOR EACH ROW WHEN (NEW.f1 % 5 < 3) EXECUTE PROCEDURE trigger_func('TRIG1');
CREATE TRIGGER trig_row_after2 AFTER UPDATE ON rem2 FOR EACH ROW WHEN (NEW.f1 % 5 < 4) EXECUTE PROCEDURE trigger_func('TRIG2');
UPDATE rem2 set f2 = 'something';

Assuming 5 rows with f1 as a serial, the fired AfterTriggerEvents' ate_tupleindex will be, in that order: 0-0-2-2-4-8-8. So, at least a backward seek is required for trig_row_after2 to be able to retrieve a tuple that was already consumed when firing trig_row_after1. On a side note, this made me realize that it is better to avoid storing a tuple entirely if there is no enabled trigger (the f1 = 4 case above). The attached patch does that, so the previous sequence becomes: 0-0-2-2-4-6-6. It also avoids initializing a tuplestore at all if it's not needed. To mitigate the effects of this behaviour, I added the option to perform a reverse seek when the looked-up tuple is nearer to the current index than to the start. If there's still a need to seek within the tuplestore, that should get rid of the O(n^2) effect. I'm hoping that per-query-level tuplestores will eliminate the need to seek entirely. I think the only case where seeking is still needed is when there is more than one after trigger to fire, since the above-mentioned change prevents seeking to skip tuples.
If you do pursue that change, make sure the code still does the right thing when it drops queued entries during subxact abort. I don't really understand what should be done at that stage. Since triggers on foreign tables are not allowed to be deferred, everything should be cleaned up at the end of each query, right? So, there shouldn't be any queued entries. I suspect that's right. If you haven't looked over AfterTriggerEndSubXact(), please do so and ensure all its actions still make sense in the context of this new kind of trigger storage. You're right, I missed something here. When aborting a subxact, the tuplestores for queries below the subxact query depth should be cleaned, if any, because AfterTriggerEndQuery has not been called for the failing query. The attached patch fixes that. The attached
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On 02/16/2014 01:51 PM, Amit Kapila wrote: On Wed, Feb 5, 2014 at 5:29 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I'm pretty sure the overhead of that would be negligible, so we could always enable it. There are certainly a lot of scenarios where prefix/suffix detection alone wouldn't help, but so what. Attached is a quick patch for that, if you want to test it. I have updated the patch to correct a few problems, addressed the review comments by Andres, and done some optimizations to reduce the CPU overhead in the worst case.

Thanks. I have to agree with Robert, though, that using the pglz encoding when we're just checking for a common prefix/suffix is a pretty crappy way of going about it [1]. As the patch stands, it includes the NULL bitmap when checking for a common prefix. That's probably not a good idea, because it defeats the prefix detection in the common case that you update a field from NULL to not-NULL or vice versa. Attached is a rewritten version, which does the prefix/suffix tests directly in heapam.c, and adds the prefix/suffix lengths directly as fields in the WAL record. Please take one more look at this version, to check whether I've missed anything. This ought to be tested with the new logical decoding stuff, as it modifies the WAL update record format, which the logical decoding stuff also relies on, but I don't know anything about that.
[1] http://www.postgresql.org/message-id/ca+tgmozstdqdku7dhcljchvmbrh1_yfouse+fuxesevnc4j...@mail.gmail.com

- Heikki

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index de4befa..cf23db5 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6594,10 +6594,15 @@ log_heap_update(Relation reln, Buffer oldbuf,
 	xl_heap_header_len xlhdr;
 	xl_heap_header_len xlhdr_idx;
 	uint8		info;
+	uint16		prefix_suffix[2];
+	uint16		prefixlen = 0,
+				suffixlen = 0;
 	XLogRecPtr	recptr;
-	XLogRecData rdata[7];
+	XLogRecData rdata[9];
 	Page		page = BufferGetPage(newbuf);
 	bool		need_tuple_data = RelationIsLogicallyLogged(reln);
+	int			nr;
+	Buffer		newbufref;

 	/* Caller should not call me on a non-WAL-logged relation */
 	Assert(RelationNeedsWAL(reln));
@@ -6607,6 +6612,54 @@ log_heap_update(Relation reln, Buffer oldbuf,
 	else
 		info = XLOG_HEAP_UPDATE;

+	/*
+	 * If the old and new tuple are on the same page, we only need to log
+	 * the parts of the new tuple that were changed.  That saves on the
+	 * amount of WAL we need to write.  Currently, we just count any
+	 * unchanged bytes in the beginning or end of the tuple.  That's quick
+	 * to check, and perfectly covers the common case that only one field
+	 * is updated.
+	 *
+	 * We could do this even if the old and new tuple are on different
+	 * pages, but only if we don't make a full-page image of the old page,
+	 * which is difficult to know in advance.  Also, if the old tuple is
+	 * corrupt for some reason, it would allow the corruption to propagate
+	 * to the new page, so it seems best to avoid that.  Under the general
+	 * assumption that for long runs most updates tend to create the new
+	 * tuple version on the same page, there should not be a significant
+	 * impact on WAL reduction or performance.
+	 *
+	 * Skip this if we're taking a full-page image of the new page, as we
+	 * don't include the new tuple in the WAL record in that case.
+	 */
+	if (oldbuf == newbuf &&
+		(need_tuple_data || !XLogCheckBufferNeedsBackup(newbuf)))
+	{
+		char	   *oldp = (char *) oldtup->t_data + oldtup->t_data->t_hoff;
+		char	   *newp = (char *) newtup->t_data + newtup->t_data->t_hoff;
+		int			oldlen = oldtup->t_len - oldtup->t_data->t_hoff;
+		int			newlen = newtup->t_len - newtup->t_data->t_hoff;
+
+		/* Check for common prefix between old and new tuple */
+		for (prefixlen = 0; prefixlen < Min(oldlen, newlen); prefixlen++)
+		{
+			if (newp[prefixlen] != oldp[prefixlen])
+				break;
+		}
+
+		/*
+		 * Storing the length of the prefix takes 2 bytes, so we need to
+		 * save at least 3 bytes or there's no point.
+		 */
+		if (prefixlen < 3)
+			prefixlen = 0;
+
+		/* Same for suffix */
+		for (suffixlen = 0; suffixlen < Min(oldlen, newlen) - prefixlen; suffixlen++)
+		{
+			if (newp[newlen - suffixlen - 1] != oldp[oldlen - suffixlen - 1])
+				break;
+		}
+		if (suffixlen < 3)
+			suffixlen = 0;
+	}
+
 	xlrec.target.node = reln->rd_node;
 	xlrec.target.tid = oldtup->t_self;
 	xlrec.old_xmax = HeapTupleHeaderGetRawXmax(oldtup->t_data);
@@ -6619,41 +6672,119 @@ log_heap_update(Relation reln, Buffer oldbuf,
 	xlrec.newtid = newtup->t_self;
 	if (new_all_visible_cleared)
 		xlrec.flags |= XLOG_HEAP_NEW_ALL_VISIBLE_CLEARED;
+	if (prefixlen > 0)
+		xlrec.flags |= XLOG_HEAP_PREFIX_FROM_OLD;
+	if (suffixlen > 0)
+		xlrec.flags |= XLOG_HEAP_SUFFIX_FROM_OLD;

-	rdata[0].data = (char *) &xlrec;
-	rdata[0].len = SizeOfHeapUpdate;
-	rdata[0].buffer = InvalidBuffer;
+	/* If new tuple is the single and first tuple on page... */
+	if (ItemPointerGetOffsetNumber(&(newtup->t_self)) ==
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On 2014-03-03 16:27:05 +0200, Heikki Linnakangas wrote: Thanks. I have to agree with Robert though that using the pglz encoding when we're just checking for a common prefix/suffix is a pretty crappy way of going about it [1]. As the patch stands, it includes the NULL bitmap when checking for a common prefix. That's probably not a good idea, because it defeats the prefix detection in the common case that you update a field from NULL to not-NULL or vice versa. Attached is a rewritten version, which does the prefix/suffix tests directly in heapam.c, and adds the prefix/suffix lengths directly as fields in the WAL record. If you could take one more look at this version, to check if I've missed anything.

Have you rerun the benchmarks? I'd guess the CPU overhead of this version is lower than earlier versions, but seeing it tested won't be a bad idea.

This ought to be tested with the new logical decoding stuff as it modifies the WAL update record format which the logical decoding stuff also relies on, but I don't know anything about that.

Hm, I think all it needs to do is disable delta encoding if need_tuple_data (which is dependent on wal_level=logical). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
On Fri, Feb 28, 2014 at 2:01 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote: Well, the jsonb portion of this is arguably the most ready; certainly it's had a lot more on-list review. Having cross-read both patches I tend to agree with this. I don't think it's unrealistic to get jsonb committable, but the hstore bits are another story. hm, do you have any specific concerns/objections about hstore? merlin
Re: [HACKERS] jsonb and nested hstore
On 2014-03-03 08:57:59 -0600, Merlin Moncure wrote: On Fri, Feb 28, 2014 at 2:01 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote: Well, the jsonb portion of this is arguably the most ready; certainly it's had a lot more on-list review. Having cross-read both patches I tend to agree with this. I don't think it's unrealistic to get jsonb committable, but the hstore bits are another story. hm, do you have any specific concerns/objections about hstore? I've listed a fair number in various emails; some have since been addressed, I think. But just take a look at the patch: at least when I last looked, it was simply far from ready. And it's quite a bit of code, so it's not something that can be addressed within 5 minutes. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Triggers on foreign tables
On Mon, Mar 03, 2014 at 11:10:30PM +0900, Kohei KaiGai wrote: I tried to check the latest (v8) patch again, then could not find problem in your design change from the v7. As Noah pointed out, it uses per query-depth tuplestore being released on AfterTriggerEndSubXact. So, may I mark it as ready for committer? Yes. Re-reviewing this is next on my list. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] jsonb and nested hstore
Andres, you can always look at our development repository: https://github.com/feodor/postgres/tree/hstore - hstore only, https://github.com/feodor/postgres/tree/jsonb_and_hstore - hstore with jsonb. Since we were concentrating on the jsonb_and_hstore branch, we usually wait for Andrew, who publishes the patch. Your last issues were addressed in both branches.

Oleg

PS. We are not native English speakers and may not understand your criticism well, but please try to be a little bit polite. We are working together and our common goal is to make postgres better. Your notes are very important for the quality of postgres, but sometimes you drive us ...

On Mon, Mar 3, 2014 at 7:00 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-03 08:57:59 -0600, Merlin Moncure wrote: On Fri, Feb 28, 2014 at 2:01 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote: Well, the jsonb portion of this is arguably the most ready, certainly it's had a lot more on-list review. Having cross-read both patches I tend to agree with this. I don't think it's unrealistic to get jsonb committable, but the hstore bits are another story. hm, do you have any specific concerns/objections about hstore? I've listed a fair number in various emails; some have since been addressed, I think. But just take a look at the patch: at least when I last looked, it was simply far from ready. And it's quite a bit of code, so it's not something that can be addressed within 5 minutes. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs si...@2ndquadrant.com wrote: Removing SELECT privilege while running a SELECT would be a different matter. This is all a matter of definition; we can make up any rules we like. Doing so is IMHO a separate patch and not something to hold up the main patch.

So I think this is an interesting point. There are various things that could go wrong as a result of using the wrong lock level. Worst would be that the server crashes or corrupts data. A little less bad would be that sessions error out with inexplicable error conditions, as in the SnapshotNow days. Alternatively, we could just have arguably wrong behavior, like basing query results on the old version of the table's metadata even after it's been changed. I don't really care about that second category of behavior. If somebody changes some property of a table and existing sessions continue to use the old value until eoxact, well, we can argue about that, but at least until we have concrete reports of really undesirable behavior, I don't think it's the primary issue. What I'm really concerned about is whether there are other things like the SnapshotNow issues that can cause stuff to halt and catch fire. I don't know whether there are or are not, but that's my concern. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] jsonb and nested hstore
Hi Oleg,

On 2014-03-03 19:17:12 +0400, Oleg Bartunov wrote: Since we were concentrating on the jsonb_and_hstore branch, we usually wait for Andrew, who publishes the patch. Your last issues were addressed in both branches. I'll try to have a look sometime soon. We are not native English speakers and may not understand your criticism well, but please try to be a little bit polite. We are working together and our common goal is to make postgres better. Your notes are very important for the quality of postgres, but sometimes you drive us ... I am sorry if I came across as impolite. I just tried to point at things I thought needed improvement, and IMO there were quite a few. A patch needing polishing isn't something that carries shame, blame, or anything else. It's just a state a patch can be in. Greetings, Andres Freund PS: Not a native speaker either... -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 3, 2014 at 7:22 PM, Andres Freund and...@2ndquadrant.com wrote: Hi Oleg, On 2014-03-03 19:17:12 +0400, Oleg Bartunov wrote: Since we were concentrating on the jsonb_and_hstore branch, we usually wait for Andrew, who publishes the patch. Your last issues were addressed in both branches. I'll try to have a look sometime soon. We are not native English speakers and may not understand your criticism well, but please try to be a little bit polite. We are working together and our common goal is to make postgres better. Your notes are very important for the quality of postgres, but sometimes you drive us ... I am sorry if I came across as impolite. I just tried to point at things I thought needed improvement, and IMO there were quite a few. A patch needing polishing isn't something that carries shame, blame, or anything else. It's just a state a patch can be in.

We don't have much time to go deep into 100-message threads, and sometimes we just lose direction.

Greetings, Andres Freund PS: Not a native speaker either...

That explains it all :)

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: I don't see that parallelizing Append is any easier than any other problem in this space. There's no parallel I/O facility, so you need a background worker per append branch to wait on I/O. And you have all the problems of making sure that the workers have the same snapshot, making sure they don't self-deadlock, etc. that you have for any other case. Erm, my thought was to use a select() loop which sends out I/O requests and then loops around waiting to see who finishes them. It doesn't parallelize the CPU cost of getting the rows back to the caller, but it would at least parallelize the I/O, and if what's underneath is actually a remote FDW running a complex query (because the other side is actually a view), it would be a massive win to have all the remote FDWs executing concurrently instead of serially, as we have today.

I can't really make sense of this. In general, what's under each branch of an append node is an arbitrary plan tree, and the only operation you can count on being able to do for each is "get me the next tuple" (i.e. ExecProcNode). Append has no idea whether that involves disk I/O, or for what blocks. But even if it did, there's no standard API for issuing an asynchronous read(), which is how we get blocks into shared buffers. We do have an API for requesting the prefetch of a block on platforms with posix_fadvise(), but we can't find out whether it's completed using select(), and even if we could, we'd still have to do the actual read() afterwards. For FDWs, one idea might be to kick off the remote query at ExecInitNode() time rather than ExecProcNode() time, at least if the remote query doesn't depend on parameters that aren't available until run time. That actually would allow multiple remote queries to run simultaneously or in parallel with local work.
It would also run them in cases where the relevant plan node is never executed, which would be bad but perhaps rare enough not to worry about. Or we could add a new API like ExecPrefetchNode() that tells nodes to prepare to have tuples pulled, and they can do things like kick off asynchronous queries. But I still don't see any clean way for the Append node to find out which one is ready to return results first. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] commit fest status and release timeline
On 1.3.2014 18:01, Peter Eisentraut wrote: Status Summary. Needs Review: 36, Waiting on Author: 7, Ready for Committer: 16, Committed: 43, Returned with Feedback: 8, Rejected: 4. Total: 114. We're still on track to achieve about 50% committed patches, which would be similar to the previous few commit fests. So, a decent job so far.

I'm wondering what the best way is to select a patch to review. I mean, there are many patches marked "needs review" (often with no reviewer) and just one or two comments, but when I checked the email archives there are often a lot of people discussing them. Do we have a list of patches that didn't get a proper review yet / badly need another one? What about improving the commitfest page by displaying the number of related e-mail messages / number of people involved? Shouldn't be difficult to get this from the mail archives ... regards Tomas
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On 2014-03-03 10:35:03 -0500, Robert Haas wrote: On Mon, Mar 3, 2014 at 9:57 AM, Andres Freund and...@2ndquadrant.com wrote: Hm, I think all it needs to do is disable delta encoding if need_tuple_data (which is dependent on wal_level=logical). Why does it need to do that? The logical decoding stuff should be able to reverse out the delta encoding. Against what should it perform the delta? Unless I misunderstand how the patch works, it computes the delta against the old tuple in the heap page? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Request improve pg_stat_statements module
On Fri, Feb 28, 2014 at 8:01 AM, Michael Paquier michael.paqu...@gmail.com wrote: Thanks for your patch! On Fri, Feb 28, 2014 at 4:18 PM, pgsql...@postgresql.kr wrote: I patched pg_stat_statements to add one column and sent it to the author; I need the last time a query ran, because I want to order the analysis by recent time. Hm... I am not sure that this brings much to pg_stat_statements, which is meant to gather normalized information about the queries run on the server. For example, things like calculating the average time of a query using total_time/calls, or even using the total time to guess where an application bottleneck is, are interesting on a busy system, while the latest timestamp is not really information that can be used for query tuning. Getting the latest timestamp at which a query has run is particularly uninteresting for OLTP-like applications where many short transactions are running. It is more interesting to classify query results by either calls, total_time, or a combination of both, IMO.

FWIW, I think it's a neat idea, but I think this proposal is probably too late for 9.4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On Mon, Mar 3, 2014 at 9:57 AM, Andres Freund and...@2ndquadrant.com wrote: Hm, I think all it needs to do is disable delta encoding if need_tuple_data (which is dependent on wal_level=logical). Why does it need to do that? The logical decoding stuff should be able to reverse out the delta encoding. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_dump reporing version of server pg_dump as comments in the output
On 27-02-2014 21:10, Wang, Jing wrote: Using pg_dump, one can dump the data into a file with the format set to 'c', 't', or plain text. In the existing version, the server and pg_dump versions are already there when the file format is 'c' or 't'. And even for the plain-text format file, the versions are there if '--verbose' is used with pg_dump. Using '--verbose' leads to many other prints which are not always required.

I don't buy your argument. Why isn't the verbose option sufficient? Did you read the old thread about this [1]? AFAICS a lot of people compare pg_dump diffs. If we apply this patch, it would break those applications. Also, it is *already* available if you add the verbose option (which is sufficient to satisfy those that want the client and/or server version) in plain mode (the other modes already include the desired info by default). In the past, timestamps were removed to avoid noise in diffs.

[1] http://www.postgresql.org/message-id/3677.1253912...@sss.pgh.pa.us

-- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Robert Haas (robertmh...@gmail.com) wrote: On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote: Erm, my thought was to use a select() loop which sends out I/O requests and then loops around waiting to see who finishes it. It doesn't parallelize the CPU cost of getting the rows back to the caller, but it'd at least parallelize the I/O, and if what's underneath is actually a remote FDW running a complex query (because the other side is actually a view), it would be a massive win to have all the remote FDWs executing concurrently instead of serially as we have today. I can't really make sense of this. Sorry, that was a bit hand-wavey since I had posted about it previously here: http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net It'd clearly be more involved than just building a select() loop and would require adding an async mechanism. I had been thinking about this primarily with the idea of FDWs, and you're right that it'd require more thought to deal with getting data into/through shared_buffers. Still, we seqscan into a ring buffer; I'd think we could make it work, but it would require additional work. For FDWs, one idea might be to kick off the remote query at ExecInitNode() time rather than ExecProcNode() time, at least if the remote query doesn't depend on parameters that aren't available until run time. Right, I had speculated about that also (option #2 in my earlier email). That actually would allow multiple remote queries to run simultaneously or in parallel with local work. It would also run them in cases where the relevant plan node is never executed, which would be bad but perhaps rare enough not to worry about. This was my primary concern, along with the fact that the docs for the FDW API explicitly say don't do that. Or we could add a new API like ExecPrefetchNode() that tells nodes to prepare to have tuples pulled, and they can do things like kick off asynchronous queries.
But I still don't see any clean way for the Append node to find out which one is ready to return results first. Yeah, that's tricky. Thanks, Stephen
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 3 March 2014 15:19, Robert Haas robertmh...@gmail.com wrote: On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs si...@2ndquadrant.com wrote: What I'm really concerned about is whether there are other things like the SnapshotNow issues that can cause stuff to halt and catch fire. I don't know whether there are or are not, but that's my concern. Of course it's a concern; I feel it also. But that's why we have the beta period, to handle the unknowns. The question is: are there any specific areas of concern here? If not, then we commit, because we've done a lot of work on it, and at the moment the balance is high benefit to users against a non-specific feeling of risk. @Noah - Last call... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] heapgetpage() and ->takenDuringRecovery
On Mon, Mar 3, 2014 at 8:33 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-03 06:57:00 -0500, Robert Haas wrote: On Sun, Mar 2, 2014 at 8:39 AM, Andres Freund and...@2ndquadrant.com wrote: I don't think this is necessary >= 9.2. There are only two interesting places where PD_ALL_VISIBLE is set: a) lazy_vacuum_page(), where an xl_heap_clean is logged *before* PD_ALL_VISIBLE/the vm is touched, and that causes recovery conflicts. The heap page is locked for cleanup at that point. As the logging of xl_heap_clean sets the page's LSN, there's no way the page can appear on the standby too early. b) empty pages in lazy_scan_heap(). If they always were empty, there's no need for conflicts. The only other way I can see to end up there is a previous heap_page_prune() that repaired fragmentation. But that logs a WAL record with conflict information. I don't think there's any reason to believe that lazy_scan_heap() can only hit pages that are empty or have just been defragged. Suppose that there's a tuple on the page which was recently inserted; the inserting transaction has committed but there are some backends that still have older snapshots. The page won't be marked all-visible because it isn't. Now, eventually those older snapshots will go away, and sometime after that the relation will get vacuumed again, and we'll once again look at the page. But this time we notice that it is all-visible, and mark it so. Right now I am missing how this isn't an actual correctness problem after a crash. Without an LSN interlock we could crash *after* the heap page has been written out, but *before* the vm WAL record has been flushed to disk. Yes. In that case, the PD_ALL_VISIBLE bit will be set on the page, but the corresponding visibility map bit will be unset. Combined with synchronous_commit=off there could be transactions that appeared safely committed to vacuum (i.e. are below GetOldestXmin()), but which are actually aborted after a crash.
Normal hint bits circumvent that by checking XLogNeedsFlush(commitLSN), but that doesn't work here. Well, we'd better not try to mark a page all-visible if it's only all-visible on the assumption that unwritten xlog will be successfully flushed to disk. But lazy_scan_heap() has code that only regards the tuple as all-visible once the tuple is hinted committed, and there's code elsewhere to keep hint bits from being set too early. And heap_page_is_all_visible() follows the same pattern. So I think it's OK, but maybe you see something I don't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
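The ordering rule the thread keeps circling around — a data page must not reach disk before the WAL that justifies its contents is durable, which is what a page's LSN enforces — can be illustrated with a toy model. All struct and function names below are invented for illustration; this is not PostgreSQL's buffer manager or XLogFlush(), just a sketch of the interlock being discussed:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the WAL-before-data rule: a page may be written out only
 * once WAL up to the page's LSN is durable.  All names are invented. */
typedef uint64_t Lsn;

typedef struct
{
    Lsn insert_lsn;             /* last WAL record generated */
    Lsn flush_lsn;              /* WAL durably on disk */
} ToyWal;

typedef struct
{
    Lsn  page_lsn;              /* LSN of the record that last touched the page */
    bool all_visible;           /* stand-in for PD_ALL_VISIBLE */
} ToyPage;

static Lsn
wal_log(ToyWal *wal)
{
    return ++wal->insert_lsn;   /* emit a record, return its LSN */
}

static void
wal_flush_upto(ToyWal *wal, Lsn upto)
{
    if (upto > wal->flush_lsn)
        wal->flush_lsn = upto;  /* make WAL durable up to 'upto' */
}

/* Mirrors the flush-WAL-before-write check: refuse to write a page whose
 * LSN is ahead of the durable WAL. */
static bool
page_write_allowed(const ToyWal *wal, const ToyPage *page)
{
    return page->page_lsn <= wal->flush_lsn;
}

/* WAL-logged flavor: bumping the page LSN holds the write back until the
 * record is flushed. */
static void
set_all_visible_logged(ToyWal *wal, ToyPage *page)
{
    page->page_lsn = wal_log(wal);
    page->all_visible = true;
}

/* Hint-bit flavor: no WAL record, page LSN unchanged -- the page can hit
 * disk immediately, which is exactly the ordering hazard described. */
static void
set_all_visible_unlogged(ToyPage *page)
{
    page->all_visible = true;
}
```

In this toy, the unlogged flavor leaves nothing to hold the page back, so a modified page can be written (and survive a crash) before any corresponding WAL — the scenario Andres raises for PD_ALL_VISIBLE set without an LSN interlock.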
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On Mon, Mar 3, 2014 at 10:38 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-03 10:35:03 -0500, Robert Haas wrote: On Mon, Mar 3, 2014 at 9:57 AM, Andres Freund and...@2ndquadrant.com wrote: Hm, I think all it needs to do is disable delta encoding if need_tuple_data (which is dependent on wal_level=logical). Why does it need to do that? The logical decoding stuff should be able to reverse out the delta encoding. Against what should it perform the delta? Unless I misunderstand how the patch works, it computes the delta against the old tuple in the heap page? Oh, maybe I need more caffeine. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
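Andres's point — the delta is computed against the old tuple version in the heap page, so a decoder that does not have that old tuple cannot reverse it — can be sketched with a toy prefix/suffix delta. This is not the patch's actual WAL format; all names and the fixed-size buffer are invented for illustration:

```c
#include <string.h>

/* Toy prefix/suffix delta: store only the bytes of the new tuple that
 * differ from the old one, plus the lengths of the shared prefix and
 * suffix.  Struct and function names are invented for illustration. */
typedef struct
{
    size_t prefix_len;          /* bytes shared with old tuple at the start */
    size_t suffix_len;          /* bytes shared with old tuple at the end */
    size_t middle_len;          /* length of the changed middle part */
    char   middle[64];          /* the changed bytes themselves */
} ToyDelta;

static void
toy_encode(const char *oldt, size_t old_len,
           const char *newt, size_t new_len, ToyDelta *d)
{
    size_t p = 0, s = 0;

    while (p < old_len && p < new_len && oldt[p] == newt[p])
        p++;                    /* common prefix */
    while (s < old_len - p && s < new_len - p &&
           oldt[old_len - 1 - s] == newt[new_len - 1 - s])
        s++;                    /* common suffix */
    d->prefix_len = p;
    d->suffix_len = s;
    d->middle_len = new_len - p - s;
    memcpy(d->middle, newt + p, d->middle_len);
}

/* Decoding requires the *old* tuple: the shared prefix and suffix bytes
 * simply are not present in the delta.  Returns the new tuple's length. */
static size_t
toy_decode(const ToyDelta *d, const char *oldt, size_t old_len, char *out)
{
    memcpy(out, oldt, d->prefix_len);
    memcpy(out + d->prefix_len, d->middle, d->middle_len);
    memcpy(out + d->prefix_len + d->middle_len,
           oldt + old_len - d->suffix_len, d->suffix_len);
    return d->prefix_len + d->middle_len + d->suffix_len;
}
```

The decode side takes the old tuple as an argument; a logical decoder without access to the pre-update heap contents has nothing to apply the delta to, which is why the thread concludes the encoding must be disabled (or the old tuple logged) when wal_level=logical needs the full tuple data.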
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Simon Riggs si...@2ndquadrant.com writes: On 3 March 2014 15:19, Robert Haas robertmh...@gmail.com wrote: What I'm really concerned about is whether there are other things like the SnapshotNow issues that can cause stuff to halt and catch fire. I don't know whether there are or are not, but that's my concern. Of course it's a concern; I feel it also. But that's why we have the beta period, to handle the unknowns. I have exactly zero faith that beta testing would catch low-probability problems in this area. What's needed, and hasn't happened AFAIK, is detailed study of the patch by assorted senior hackers. The question is are there any specific areas of concern here? If not, then we commit because we've done a lot of work on it and at the moment the balance is high benefit to users against a non-specific feeling of risk. This is backwards. The default decision around here has never been to commit when in doubt. regards, tom lane
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
On Mon, Mar 3, 2014 at 10:43 AM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote: Erm, my thought was to use a select() loop which sends out I/O requests and then loops around waiting to see who finishes it. It doesn't parallelize the CPU cost of getting the rows back to the caller, but it'd at least parallelize the I/O, and if what's underneath is actually a remote FDW running a complex query (because the other side is actually a view), it would be a massive win to have all the remote FDWs executing concurrently instead of serially as we have today. I can't really make sense of this. Sorry, that was a bit hand-wavey since I had posted about it previously here: http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net Huh, somehow I can't remember reading that... but I didn't think I had missed any posts, either. Evidently I did. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Sun, Mar 2, 2014 at 4:50 AM, Simon Riggs si...@2ndquadrant.com wrote: v20 includes slightly re-ordered checks in GetLockLevel, plus more detailed comments on each group of subcommands. Also corrects grammar as noted by Vik. Plus adds an example of usage to the docs. This patch contains a one-line change to src/bin/pg_dump/pg_backup_archiver.c which seems not to belong. This hunk in ATRewriteCatalogs() looks scary:

+   /*
+    * If we think we might need to add/re-add toast tables then
+    * we currently need to hold an AccessExclusiveLock.
+    */
+   if (lockmode < AccessExclusiveLock)
+       return;

It would make sense to me to add an Assert() or elog() check inside the subsequent loop to verify that the lock level is adequate ... but just returning silently seems like a bad idea. I have my doubts about whether it's safe to do AT_AddInherit, AT_DropInherit, AT_AddOf, or AT_DropOf with less than a full lock. All of those can change the tuple descriptor, and we discussed, back when we did this the first time, the fact that the executor may get *very* unhappy if the tuple descriptor changes in mid-execution. I strongly suspect these are unsafe with less than a full AccessExclusiveLock. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Robert Haas (robertmh...@gmail.com) wrote: http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net Huh, somehow I can't remember reading that... but I didn't think I had missed any posts, either. Evidently I did. You and everyone else- you'll note it got exactly zero responses.. Ah well. :) Thanks, Stephen
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Sun, Mar 2, 2014 at 2:38 PM, Tan Tran tankimt...@gmail.com wrote: 2. Proposal As a GSoC student, I will implement WAL recovery of hash indexes using the other index types' WAL code as a guide. Roughly, I will: - Devise a way to store and retrieve hashing data within the XLog data structures. - In the existing skeleton for hash_redo(XLogRecPtr lsn, XLogRecord *record) in hash.c, branch to code for the various redo operations: creating an index, inserting into an index, deleting an index, and page operations (split, delete, update?). - Code each branch by drawing on examples from btree_redo, gin_redo, and gist_redo, the existing XLog code of the other index types. Unfortunately, I don't believe that it's possible to do this easily today because of the way bucket splits are handled. I wrote about this previously here, with an idea for solving the problem: http://www.postgresql.org/message-id/ca+tgmozymojsrfxhxq06g8jhjxqcskvdihb_8z_7nc7hj7i...@mail.gmail.com Sadly, no one responded. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GiST support for inet datatypes
On Thu, Feb 27, 2014 at 12:30 PM, Tom Lane t...@sss.pgh.pa.us wrote: Florian Pflug f...@phlo.org writes: On Feb27, 2014, at 17:56 , Tom Lane t...@sss.pgh.pa.us wrote: That's not a bug, it's a feature, for much the same reasons that pg_dump tries to minimize explicit schema-qualification. I fail to see the value in this for opclasses. It's certainly nice for schema qualifications, because dumping one schema and restoring into a different schema is probably quite common. The value in it is roughly the same as the reason we don't include a version number when dumping CREATE EXTENSION. If you had a default opclass in the source database, you probably want a default opclass in the destination, even if that's not bitwise the same as what you had before. The implication is that you want best practice for the current PG version. I don't think that argument holds a lot of water in this instance. The whole reason for having multiple opclasses is that each one can implement different comparison behavior. It's unlikely that you want an upgrade to change the comparison behavior you had before; you'd be sad if, for example, the dump/restore process failed to preserve your existing collation settings. But even if that were desirable in general, suppressing it for binary upgrade dumps certainly seems more than sane. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 3 March 2014 15:53, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 3 March 2014 15:19, Robert Haas robertmh...@gmail.com wrote: What I'm really concerned about is whether there are other things like the SnapshotNow issues that can cause stuff to halt and catch fire. I don't know whether there are or are not, but that's my concern. Of course it's a concern; I feel it also. But that's why we have the beta period to handle the unknowns. I have exactly zero faith that beta testing would catch low-probability problems in this area. What's needed, and hasn't happened AFAIK, is detailed study of the patch by assorted senior hackers. The question is are there any specific areas of concern here? If not, then we commit because we've done a lot of work on it and at the moment the balance is high benefit to users against a non-specific feeling of risk. This is backwards. The default decision around here has never been to commit when in doubt. I'm not in doubt. If anybody can give me some more pointers to things to look at, I will. But I've looked and I can't see anything more. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] GSoC proposal - make an unlogged table logged
Hi all, Is the TODO item make an unlogged table logged [1] a good GSoC project? Regards, [1] http://www.postgresql.org/message-id/aanlktinenzbrxdcwohkqbba2bhubfy8_c5jwrxlod...@mail.gmail.com -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] commit fest status and release timeline
On Mon, Mar 3, 2014 at 4:34 PM, Tomas Vondra t...@fuzzy.cz wrote: On 1.3.2014 18:01, Peter Eisentraut wrote: Status Summary. Needs Review: 36, Waiting on Author: 7, Ready for Committer: 16, Committed: 43, Returned with Feedback: 8, Rejected: 4. Total: 114. We're still on track to achieve about 50% committed patches, which would be similar to the previous few commit fests. So a decent job so far. I'm wondering what is the best way to select a patch to review. I mean, there are many patches with needs review (and often no reviewer) and just one or two comments, but when I checked the email archives there's often a lot of people discussing it. Do we have a list of patches that didn't get a proper review yet / badly need another one? What about improving the commitfest page by displaying the number of related e-mail messages / number of people involved? Shouldn't be difficult to get this from the mail archives ... I have some code for that part that needs a couple of rounds of final hacking and polish. I've had many targets for it, but right now the target is to be done before pgcon, so we can put it in play for the next set of commitfests. It's not going to happen for *this* one, and we don't want to disrupt the flow even more by making big changes to the tooling in the middle of it. That said, there is definitely a need :) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 3 March 2014 16:06, Robert Haas robertmh...@gmail.com wrote: On Sun, Mar 2, 2014 at 4:50 AM, Simon Riggs si...@2ndquadrant.com wrote: v20 includes slightly re-ordered checks in GetLockLevel, plus more detailed comments on each group of subcommands. Also corrects grammar as noted by Vik. Plus adds an example of usage to the docs. This patch contains a one-line change to src/bin/pg_dump/pg_backup_archiver.c which seems not to belong. This hunk in ATRewriteCatalogs() looks scary:

+   /*
+    * If we think we might need to add/re-add toast tables then
+    * we currently need to hold an AccessExclusiveLock.
+    */
+   if (lockmode < AccessExclusiveLock)
+       return;

It would make sense to me to add an Assert() or elog() check inside the subsequent loop to verify that the lock level is adequate ... but just returning silently seems like a bad idea. OK, I will add an elog() check. I have my doubts about whether it's safe to do AT_AddInherit, AT_DropInherit, AT_AddOf, or AT_DropOf with less than a full lock. All of those can change the tuple descriptor, and we discussed, back when we did this the first time, the fact that the executor may get *very* unhappy if the tuple descriptor changes in mid-execution. I strongly suspect these are unsafe with less than a full AccessExclusiveLock. I'm happy to change those if you feel there is insufficient evidence. I don't personally feel that it would matter to usability to keep locks for those at AccessExclusiveLock, especially since they are otherwise fast. Some others might be kept higher also. I'm merely trying to balance between requests to reduce to the minimal theoretical level and fears that anything less than AccessExclusiveLock is a problem. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Mon, Mar 3, 2014 at 11:29 AM, Simon Riggs si...@2ndquadrant.com wrote: On 3 March 2014 16:06, Robert Haas robertmh...@gmail.com wrote: On Sun, Mar 2, 2014 at 4:50 AM, Simon Riggs si...@2ndquadrant.com wrote: v20 includes slightly re-ordered checks in GetLockLevel, plus more detailed comments on each group of subcommands. Also corrects grammar as noted by Vik. Plus adds an example of usage to the docs. This patch contains a one-line change to src/bin/pg_dump/pg_backup_archiver.c which seems not to belong. This hunk in ATRewriteCatalogs() looks scary:

+   /*
+    * If we think we might need to add/re-add toast tables then
+    * we currently need to hold an AccessExclusiveLock.
+    */
+   if (lockmode < AccessExclusiveLock)
+       return;

It would make sense to me to add an Assert() or elog() check inside the subsequent loop to verify that the lock level is adequate ... but just returning silently seems like a bad idea. OK, I will add an elog() check. I have my doubts about whether it's safe to do AT_AddInherit, AT_DropInherit, AT_AddOf, or AT_DropOf with less than a full lock. All of those can change the tuple descriptor, and we discussed, back when we did this the first time, the fact that the executor may get *very* unhappy if the tuple descriptor changes in mid-execution. I strongly suspect these are unsafe with less than a full AccessExclusiveLock. I'm happy to change those if you feel there is insufficient evidence. Not sure what that means, but yes, I think the lock levels on those need to be increased. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item make an unlogged table logged [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GiST support for inet datatypes
Robert Haas robertmh...@gmail.com writes: On Thu, Feb 27, 2014 at 12:30 PM, Tom Lane t...@sss.pgh.pa.us wrote: The value in it is roughly the same as the reason we don't include a version number when dumping CREATE EXTENSION. If you had a default opclass in the source database, you probably want a default opclass in the destination, even if that's not bitwise the same as what you had before. The implication is that you want best practice for the current PG version. I don't think that argument holds a lot of water in this instance. The whole reason for having multiple opclasses is that each one can implement different comparison behavior. Well, I doubt we'd accept a proposal to change the default opclass of a datatype to something that had incompatible behavior --- but we might well accept one to change it to something that had improved behavior, such as more operators. The first couple dozen lines in GetIndexOpClass() make for interesting reading in this context. That approach to cross-version compatibility probably doesn't work in the pg_upgrade universe, of course; but what I'd like to point out here is that those kluges wouldn't have been necessary in the first place if pg_dump had had the suppress-default-opclasses behavior at the time. (No, it didn't always do that; cf commits e5bbf1965 and 1929a90b6.) regards, tom lane
Re: [HACKERS] GSoC proposal - make an unlogged table logged
* Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item make an unlogged table logged [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?. Thanks, Stephen
Re: [HACKERS] UNION ALL on partitioned tables won't use indices.
On Mon, Mar 03, 2014 at 07:01:09PM +0900, Kyotaro HORIGUCHI wrote: Yes, the old dumped version of the typ2 patch did so. It flattened the appendrel tree for the query properly. Let me hear the reason you prefer to do so. Having reviewed my upthread reasoning for preferring one of those two approaches over the other, it's a weak preference. They have similar runtime costs. Putting the logic with the code that creates appendrels reduces the number of code sites one must examine to reason about possible plan structures. We might not flatten RTE_RELATION appendrel parents exactly the same way we flatten RTE_SUBQUERY appendrel parents. I would tend to leave inh=true for the former, for reasons explained in my notes on v7, but set inh=false for the latter to save needless work. Unfortunately, RTE_SUBQUERY-es with their inh flag cleared might cause something inconvenient in preprocess_minmax_aggregates() and/or add_rtes_to_flat_rtable().. preprocess_minmax_aggregates() only cares about RTEs reachable from the join tree, and the appendrel parents made obsolete by flattening would not be reachable from the join tree. add_rtes_to_flat_rtable() might be unhappy. # I haven't found that related to sepgsql, though :-( I understood that your concern is to treat parent RTEs differently according to their relkinds. But I think the inh flags could not be modified at all for the reason mentioned above. That's fine, then. It was a minor point. If you are convinced that a separate flattening pass is best, that suffices for me at this stage. Please submit the patch you have in mind, incorporating any improvements from the v7 patch that are relevant to both approaches. At least as of now, inheritance tables define the bottom bound of an appendrel tree and on the other hand complex(?)
union_alls define the upper bound, and both multilevel (simple) union_alls and inheritances are flattened individually, so the only possible inheritance tree to be collapsed by this patch is, I think, Subquery(inh=1)[Relation-inhparent [Relation-child, child, child]] On the other hand, a flattening pass is less code overall and brings an attractive uniformity-by-default to the area. Focusing only on the above structure, what we should do to collapse this tree is only connect the children to the Subquery directly, then remove all appendrelinfos connecting to the Relation-inhparent. The inh flag need not be modified. # Umm. I strongly suspect that I overlooked something.. Then even widening to the general case, the case doesn't seem to change. All we need to do is link a child to its grandparent and isolate the parent, removing its appendrelinfos. I barely follow what you're saying here. Do you speak of union_inh_idx_typ2_v2_20131113.patch, unionall_inh_idx_typ3_v7.patch, or some future design? If we use a separate flattening pass, there's no small limit on how many layers of appendrel we may need to flatten. pull_up_subqueries() can create many nested RTE_SUBQUERY appendrel layers; there may be more than just child, parent and grandparent. There's never more than one layer of RTE_RELATION appendrel, though. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
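The "link the child to its grandparent and drop the intermediate parent" operation both sides describe is the classic flattening of nested append nodes. A generic sketch of that idea — invented structures, not the planner's RangeTblEntry/AppendRelInfo machinery, and with a fixed child-array size for brevity:

```c
/* Toy flattening of nested "append" nodes: any child that is itself an
 * append node is replaced by its own children, recursively, so only one
 * append level remains over the same set of leaves.  All names invented;
 * the planner's real AppendRelInfo handling is more involved. */
typedef struct ToyNode
{
    int              is_append;     /* append node vs. leaf relation */
    int              relid;         /* meaningful for leaves only */
    struct ToyNode  *children[8];   /* fixed size, for the sketch only */
    int              nchildren;
} ToyNode;

/* Attach every leaf under 'node' directly to 'flat', skipping any
 * intermediate append nodes (the "parents" being removed). */
static void
collect_leaves(ToyNode *node, ToyNode *flat)
{
    if (!node->is_append)
    {
        flat->children[flat->nchildren++] = node;
        return;
    }
    for (int i = 0; i < node->nchildren; i++)
        collect_leaves(node->children[i], flat);
}

/* Produce a single-level append node covering the same leaves.  Because
 * this recurses, it handles arbitrarily many nested layers -- the case
 * Noah notes pull_up_subqueries() can create. */
static void
flatten_appends(ToyNode *root, ToyNode *flat)
{
    flat->is_append = 1;
    flat->nchildren = 0;
    collect_leaves(root, flat);
}
```

In this toy form, the recursion is what handles the "no small limit on how many layers" concern: grandchildren, great-grandchildren, and deeper all end up attached directly to the single surviving append node.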
Re: [HACKERS] patch: option --if-exists for pg_dump
Pavel Stehule escribió: This patch has a redesigned implementation of --if-exists for pg_dumpall. Now it is not propagated to pg_dump, but used at the pg_dumpall level. Seems sane, thanks. BTW after this patch, I still don't see an error-free output from restoring a database on top of itself. One problem is plpgsql, which is now an extension, so pg_dump emits this error message:

ERROR:  cannot drop language plpgsql because extension plpgsql requires it
HINT:  You can drop extension plpgsql instead.

Another problem is that some DROP commands don't work. For instance, if the public schema in the target database contains objects that haven't been dropped yet, the DROP command will fail:

ERROR:  cannot drop schema public because other objects depend on it
DETAIL:  function bt_metap(text) depends on schema public
function bt_page_items(text,integer) depends on schema public
function bt_page_stats(text,integer) depends on schema public
function f() depends on schema public
function get_raw_page(text,integer) depends on schema public
function heap_page_items(bytea) depends on schema public
function locate_tuple_corruption() depends on schema public
function page_header(bytea) depends on schema public
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

(The way I got this was by using my 8.2 installation, on which I ran the regression tests; then I dumped the resulting regression database. The database on which I restored wasn't clean, as it contained unrelated junk in the public schema.) Not sure what the right answer to this problem is, but it cannot be attributed to this patch anyway. I'm about to push this, since other than the above problems, this functionality seems to be working as designed. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] GSoC proposal - make an unlogged table logged
Stephen Frost sfr...@snowman.net writes: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item make an unlogged table logged [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?. IIRC, the reason you'd have to do that is to make the table contents appear on slave servers. If you don't consider replication then it might seem easier. regards, tom lane
Re: [HACKERS] GSoC on WAL-logging hash indexes
Thanks for alerting me to your previous idea. While I don't know enough about Postgresql internals to judge its merits yet, I'll write some pseudocode based on it in my proposal; and I'll relegate it to a reach proposal alongside a more straightforward one. Tan On Mon, Mar 3, 2014 at 8:12 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Mar 2, 2014 at 2:38 PM, Tan Tran tankimt...@gmail.com wrote: 2. Proposal As a GSoC student, I will implement WAL recovery of hash indexes using the other index types' WAL code as a guide. Roughly, I will: - Devise a way to store and retrieve hashing data within the XLog data structures. - In the existing skeleton for hash_redo(XLogRecPtr lsn, XLogRecord *record) in hash.c, branch to code for the various redo operations: creating an index, inserting into an index, deleting an index, and page operations (split, delete, update?). - Code each branch by drawing on examples from btree_redo, gin_redo, and gist_redo, the existing XLog code of the other index types. Unfortunately, I don't believe that it's possible to do this easily today because of the way bucket splits are handled. I wrote about this previously here, with an idea for solving the problem: http://www.postgresql.org/message-id/ca+tgmozymojsrfxhxq06g8jhjxqcskvdihb_8z_7nc7hj7i...@mail.gmail.com Sadly, no one responded. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On 03/03/2014 05:22 PM, Tom Lane wrote: Stephen Frost sfr...@snowman.net writes: ... ISTR the discussion going something along the lines of "we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?". IIRC, the reason you'd have to do that is to make the table contents appear on slave servers. If you don't consider replication then it might seem easier. So switch on logging and then perform CLUSTER/VACUUM FULL ? Should this work, or is something extra needed ? Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On 2014-03-03 12:08:26 -0500, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item "make an unlogged table logged" [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of "we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?". I don't see that as a particularly problematic problem. The primary reason to want to convert an unlogged to a logged table probably is that it's easier to do so than to recreate the table + dependencies. Also the overhead of logging full pages will be noticeably smaller than the overhead of adding all rows individually, even if using heap_multi_insert(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Mon, Mar 03, 2014 at 10:19:55AM -0500, Robert Haas wrote: On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs si...@2ndquadrant.com wrote: Removing SELECT privilege while running a SELECT would be a different matter. This is all a matter of definition; we can make up any rules we like. Doing so is IMHO a separate patch and not something to hold up the main patch. So I think this is an interesting point. There are various things that could go wrong as a result of using the wrong lock level. Worst would be that the server crashes or corrupts data. A little less bad would be that sessions error out with inexplicable error conditions, as in SnapshotNow days. Alternatively, we could just have arguably wrong behavior, like basing query results on the old version of the table's metadata even after it's been changed. I would order the concerns like this:

1. Data corruption
2. Transient, clearly-wrong answers without an error
3. Server crash
4. Catalog logical inconsistency
5. Inexplicable, transient errors
6. Valid behavior capable of surprising more than zero upgraders

I don't really care about that second category of behavior. If somebody changes some property of a table and existing sessions continue to use the old value until eoxact, well, we can argue about that, but at least until we have concrete reports of really undesirable behavior, I don't think it's the primary issue. What I'm really concerned about is whether there are other things like the SnapshotNow issues that can cause stuff to halt and catch fire. I don't know whether there are or are not, but that's my concern. Since we can't know whether something qualifies as (2) or (6) without analyzing it, I don't find waiting for user complaints to be a good strategy here. An ownership change not immediately affecting ACL checks does fall under (6), for me. (However, changing ownership without AccessExclusiveLock might also create hazards in category (4) for concurrent DDL that performs owner checks.)
-- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Mon, Mar 03, 2014 at 03:43:46PM +, Simon Riggs wrote: The question is are there any specific areas of concern here? If not, then we commit because we've done a lot of work on it and at the moment the balance is high benefit to users against a non-specific feeling of risk. @Noah - Last call... I am not specifically aware of any outstanding problems. I have planned to give this a close look, but it will be at least two weeks before I dig out far enough to do so. If that makes it a post-commit review, so be it. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] UNION ALL on partitioned tables won't use indices.
Noah Misch n...@leadboat.com writes: If you are convinced that a separate flattening pass is best, that suffices for me at this stage. Please submit the patch you have in mind, incorporating any improvements from the v7 patch that are relevant to both approaches. I went back and re-read the original message, and this time it struck me that really the issue here is that add_child_rel_equivalences() doesn't think it has to deal with the case of a parent rel being itself a child. That's not inherent though; it's just that it didn't occur to me at the time that such a situation could arise. It takes only very small changes to allow that to happen. If you do that, as in the attached changes to equivclass.c, you get "could not find pathkey item to sort" errors from createplan.c; but that's just because create_merge_append_plan() is likewise not expecting the mergeappend's parent rel to be itself a child. Fix for that is a one-liner, ie, pass down relids. That gets you to a point where the code generates a valid plan, but it's got nested MergeAppends, which are unnecessary expense. Kyotaro-san's original fix for that was overcomplicated. It's sufficient to teach accumulate_append_subpath() to flatten nested MergeAppendPaths. In short, the attached patch seems to fix it, for a lot less added complication than anything else that's been discussed on this thread. I've only lightly tested it (it could use a new regression test case), but unless someone can find a flaw I think we should use this approach. regards, tom lane

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 03be7b1..5777cb2 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** get_cheapest_parameterized_child_path(Pl
*** 1021,1030 ****
  * accumulate_append_subpath
  *		Add a subpath to the list being built for an Append or MergeAppend
  *
! * It's possible that the child is itself an Append path, in which case
! * we can cut out the middleman and just add its child paths to our
! * own list.  (We don't try to do this earlier because we need to
! * apply both levels of transformation to the quals.)
  */
 static List *
 accumulate_append_subpath(List *subpaths, Path *path)
--- 1021,1035 ----
  * accumulate_append_subpath
  *		Add a subpath to the list being built for an Append or MergeAppend
  *
! * It's possible that the child is itself an Append or MergeAppend path, in
! * which case we can cut out the middleman and just add its child paths to
! * our own list.  (We don't try to do this earlier because we need to apply
! * both levels of transformation to the quals.)
! *
! * Note that if we omit a child MergeAppend in this way, we are effectively
! * omitting a sort step, which seems fine: if the parent is to be an Append,
! * its result would be unsorted anyway, while if the parent is to be a
! * MergeAppend, there's no point in a separate sort on a child.
  */
 static List *
 accumulate_append_subpath(List *subpaths, Path *path)
*** accumulate_append_subpath(List *subpaths
*** 1036,1041 ****
--- 1041,1053 ----
 		/* list_copy is important here to avoid sharing list substructure */
 		return list_concat(subpaths, list_copy(apath->subpaths));
 	}
+ 	else if (IsA(path, MergeAppendPath))
+ 	{
+ 		MergeAppendPath *mpath = (MergeAppendPath *) path;
+
+ 		/* list_copy is important here to avoid sharing list substructure */
+ 		return list_concat(subpaths, list_copy(mpath->subpaths));
+ 	}
 	else
 		return lappend(subpaths, path);
 }
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 35d2a83..ac12f84 100644
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
*** add_child_rel_equivalences(PlannerInfo *
*** 1937,1952 ****
 		if (cur_ec->ec_has_volatile)
 			continue;

! 		/* No point in searching if parent rel not mentioned in eclass */
! 		if (!bms_is_subset(parent_rel->relids, cur_ec->ec_relids))
 			continue;

 		foreach(lc2, cur_ec->ec_members)
 		{
 			EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);

! 			if (cur_em->em_is_const || cur_em->em_is_child)
! 				continue;		/* ignore consts and children here */

 			/* Does it reference parent_rel? */
 			if (bms_overlap(cur_em->em_relids, parent_rel->relids))
--- 1937,1956 ----
 		if (cur_ec->ec_has_volatile)
 			continue;

! 		/*
! 		 * No point in searching if parent rel not mentioned in eclass; but
! 		 * we can't tell that for sure if parent rel is itself a child.
! 		 */
! 		if (parent_rel->reloptkind == RELOPT_BASEREL &&
! 			!bms_is_subset(parent_rel->relids, cur_ec->ec_relids))
 			continue;

 		foreach(lc2, cur_ec->ec_members)
 		{
 			EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);

! 			if (cur_em->em_is_const)
! 				continue;		/* ignore consts here */

 			/* Does it
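Tom notes the patch could use a new regression test case; a minimal SQL sketch of the scenario from the thread's subject (table and index names are illustrative, not from the thread) might look like this:

```sql
-- Two inheritance trees with indexed children, combined with UNION ALL.
-- Before the fix, the planner could not use the children's indexes to
-- satisfy the ORDER BY across the UNION ALL.
CREATE TABLE t1 (a int);
CREATE TABLE t1c1 () INHERITS (t1);
CREATE INDEX ON t1c1 (a);
CREATE TABLE t2 (a int);
CREATE TABLE t2c1 () INHERITS (t2);
CREATE INDEX ON t2c1 (a);

-- The interesting thing to check in the plan is a single MergeAppend over
-- the children's index scans, rather than a Sort over nested Appends.
EXPLAIN (COSTS OFF)
SELECT a FROM t1 UNION ALL SELECT a FROM t2 ORDER BY a LIMIT 10;
```

The expected plan shape is only a sketch of what the thread describes; the actual regression test that got committed may differ.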
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
On 03/03/2014 11:34 AM, Christian Kruse wrote: Hi, Attached is a patch with the updated documentation (now uses consistently huge pages) as well as a renamed GUC, consistent wording (always use huge pages) as well as renamed variables. Hmm, I wonder if that could now be misunderstood to have something to do with the PostgreSQL page size? Maybe add the word memory or operating system in the first sentence in the docs, like this: Enables/disables the use of huge memory pages. Accepted, see attached patch. Thanks, committed! I spotted this in section 17.4.1 Shared Memory and Semaphores: Linux The default maximum segment size is 32 MB, and the default maximum total size is 2097152 pages. A page is almost always 4096 bytes except in unusual kernel configurations with huge pages (use getconf PAGE_SIZE to verify). It's not any more wrong now than it's always been, but I don't think huge pages ever affect PAGE_SIZE... Could I cajole you into rephrasing that, too? - Heikki
Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb
Jeff Janes jeff.ja...@gmail.com wrote: But I do wonder what experience people have with the 3 stage process, how useful is it empirically? If you can't open the database for general use until the 3rd phase is done, then you would just jump to doing that stage, rather than working through all 3 of them. If you can open the database and muddle through without statistics for a while, why not muddle through for the little bit longer that it would take to collect the full set right off the bat, rather than making intermediate passes? It's not always a little bit of time. For a description of my experience with a home-grown 3 stage process before one was built into pg_upgrade, see this post: http://www.postgresql.org/message-id/1373465348.51692.yahoomail...@web162906.mail.bf1.yahoo.com Basically, we cut our down time from hours to minutes without serious impairment of performance. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
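For reference, the three stages under discussion amount to analyzing every table with progressively larger statistics targets, so the optimizer gets rough statistics almost immediately and full statistics later. A sketch of what the generated script does (the exact targets shown here are illustrative of the scheme, not quoted from the thread):

```sql
-- Stage 1: minimal statistics, very fast; gives the planner something to work with
SET default_statistics_target = 1;
ANALYZE;

-- Stage 2: intermediate statistics
SET default_statistics_target = 10;
ANALYZE;

-- Stage 3: full statistics at the configured default
RESET default_statistics_target;
ANALYZE;
```

The point Kevin makes is that stage 1 completes in minutes even on large clusters, so the database is usable long before stage 3 finishes.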
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 3 March 2014 18:57, Noah Misch n...@leadboat.com wrote: On Mon, Mar 03, 2014 at 10:19:55AM -0500, Robert Haas wrote: On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs si...@2ndquadrant.com wrote: Removing SELECT privilege while running a SELECT would be a different matter. This is all a matter of definition; we can make up any rules we like. Doing so is IMHO a separate patch and not something to hold up the main patch. So I think this is an interesting point. There are various things that could go wrong as a result of using the wrong lock level. Worst would be that the server crashes or corrupts data. A little less bad would be that sessions error out with inexplicable error conditions, as in SnapshotNow days. Alternatively, we could just have arguably wrong behavior, like basing query results on the old version of the table's metadata even after it's been changed. I would order the concerns like this: 1. Data corruption 2. Transient, clearly-wrong answers without an error 3. Server crash 4. Catalog logical inconsistency 5. Inexplicable, transient errors 6. Valid behavior capable of surprising more than zero upgraders I like your model for risk assessment. How can we apply it in detail in a way that helps us decide? Or do we just go on gut feel? My experience with mentioning such topics is that without structure it results in an assessment of unacceptable risk just simply because somebody has mentioned some scary words. I don't really care about that second category of behavior. If somebody changes some property of a table and existing sessions continue to use the old value until eoxact, well, we can argue about that, but at least until we have concrete reports of really undesirable behavior, I don't think it's the primary issue. What I'm really concerned about is whether there are other things like the SnapshotNow issues that can cause stuff to halt and catch fire. I don't know whether there are or are not, but that's my concern. 
Since we can't know whether something qualifies as (2) or (6) without analyzing it, I don't find waiting for user complaints to be a good strategy here. An ownership change not immediately affecting ACL checks does fall under (6), for me. (However, changing ownership without AccessExclusiveLock might also create hazards in category (4) for concurrent DDL that performs owner checks.) err, guys, you do realise that changing ownership is staying at AccessExclusiveLock in this patch? (and I haven't ever suggested lowering that). -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
On 04/03/14 04:25, Oleg Bartunov wrote: On Mon, Mar 3, 2014 at 7:22 PM, Andres Freund and...@2ndquadrant.com wrote: [...] PS: Not a native speaker either... That's explain all :) [...] I AM a native English speaker born in England - though if you read some of my postings where I've been particularly careless, you might well assume otherwise! My Chinese wife sometimes corrects my English, and from time to time she is right! To the extent that I've read the postings of non-native English speakers like Oleg & Andres, I've not noticed any difficulty understanding what they meant - other than technical issues that would also be the same for me from gifted native English speakers! Cheers, Gavin
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item "make an unlogged table logged" [1] a good GSoC project? Another interesting project around unlogged tables would be to make it possible to have unlogged indexes on fully-logged tables. That is something that there was some discussion of before, that might be easier. FWIW, I don't think that TODO page is a very good resource for finding a starter project. Picking a good project is a skill in and of itself. A lot of that stuff is aspirational, either because it's difficult, or, more often, because it's difficult relative to the rewards, which can be quite low. To be honest, if I have what I imagine to be a great idea for a project, I don't put it on that page. Maybe I should, but I don't, and I don't think that is uncommon. This is not because I'm particularly guarded about sharing the information. Why do you think that hash indexes still aren't WAL-logged after all these years (a project that someone made noise about recently in relation to GSoC), even though that's generally considered to be a SMOP? -- Peter Geoghegan
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On 2014-03-03 12:44:26 -0800, Peter Geoghegan wrote: On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item "make an unlogged table logged" [1] a good GSoC project? Another interesting project around unlogged tables would be to make it possible to have unlogged indexes on fully-logged tables. That is something that there was some discussion of before, that might be easier. I'd actually say it's harder because it requires modifying the catalog or transparently introducing hacks similar to what unlogged matviews are doing, to make sure the index is marked invalid after a crash restart. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Changeset Extraction v7.9
On Mon, Mar 3, 2014 at 11:26 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-27 17:56:08 +0100, Andres Freund wrote: * do we modify struct SnapshotData to be polymorphic based on some tag or move comments there? I tried that, and it got far too invasive. So I've updated the relevant comment in snapshot.h, inl * How/whether to change the exclusive lock on the ProcArrayLock in CreateInitDecodingContext() I looked at this, and I believe the current code is the best solution. It's pretty far away from any hot codepath and it's a short operation. I liked the idea about using snapmgr.c for this in principle, but it doesn't have enough smarts by far... So, attached is the newest version: * Management of historic/timetravel snapshot is now done by snapmgr.c, not tqual.c anymore. No ->satisfies pointers are redirected anymore * removal of the suspend logic for historic snapshot, instead the one place that needed it, now explicitly uses a snapshot * removal of some pointless CREATE EXTENSIONs from the regression tests * splitoff of the slot tests that aren't committable into a separate commit. * minor doc adjustments I am not aware of any further things that need to be fixed now (in contrast to features for later releases of which there are aplenty). OK, I've committed the 0001 patch, which is the core of this feature, with a bit of minor additional hacking. I'm sure there are some problems here yet and some things that people will want fixed, as is inevitable for any patch of this size. But I don't have any confidence that further postponing commit is going to be the best way to find those issues, so in it goes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
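The committed feature's SQL-level interface can be exercised along these lines — a sketch assuming the contrib test_decoding output plugin, wal_level = logical, and a free replication slot, matching the test requirements mentioned elsewhere in the thread:

```sql
-- Create a logical replication slot that decodes changes via test_decoding
SELECT * FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');

-- Generate some changes to decode
CREATE TABLE data (id serial PRIMARY KEY, val text);
INSERT INTO data (val) VALUES ('one');

-- Consume the change stream from the slot (NULL, NULL = no LSN/row limit)
SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);

-- Drop the slot when done, releasing its resources
SELECT pg_drop_replication_slot('regression_slot');
```

The output rows are textual representations produced by the plugin; exact formatting is up to the output plugin, not the core feature.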
[HACKERS] building pgadmin4
Hi, I'm trying to build pgadmin4, out of curiosity. I'm on a ubuntu 13.10 desktop vm. I added qt webkitwidgets, and now I run into the next error, which doesn't seem to make much sense: wbloos2@vm1:~/pgadmin4/runtime$ qmake Project MESSAGE: Building for QT5+... Project ERROR: Unknown module(s) in QT: quick I haven't found the word quick in any of the code and there's no .qml file. any clues? Cheers, WBL -- Quality comes from focus and clarity of purpose -- Mark Shuttleworth
Re: [HACKERS] building pgadmin4
On Mon, Mar 3, 2014 at 4:53 PM, Willy-Bas Loos willy...@gmail.com wrote: I'm trying to build pgadmin4, out of curiosity. I'm on a ubuntu 13.10 desktop vm. I added qt webkitwidgets, and now I run into the next error, which doesn't seem to make much sense: wbloos2@vm1:~/pgadmin4/runtime$ qmake Project MESSAGE: Building for QT5+... Project ERROR: Unknown module(s) in QT: quick I haven't found the word quick in any of the code and there's no .qml file. any clues? I suggest that you post to one of the pgadmin mailing lists: http://www.postgresql.org/list/pgadmin-support/ http://www.postgresql.org/list/pgadmin-hackers/ pgAdmin is off-topic for this mailing list. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] plpgsql.warn_shadow
On Wed, Jan 15, 2014 at 1:34 AM, Marko Tiikkaja ma...@joh.to wrote: Hi all! It's me again, trying to find a solution to the most common mistakes I make. This time it's accidental shadowing of variables, especially input variables. I've wasted several hours banging my head against the wall while shouting "HOW CAN THIS VARIABLE ALWAYS BE NULL?". I can't believe I'm the only one. To give you a rough idea on how it works: +1 I've made the same mistake countless times. For me, it always happens when I have a declared variable in a function which I later on make an input parameter instead and forget to remove it under the declare section of the function. Regarding the warning vs error discussion, I think it depends on whether we want to maximize short-term or long-term user-friendliness. In the short-term it's more user-friendly to lie to the user and pretend everything is fine by not crashing the PL/pgSQL application and instead writing warning messages in the log file which might not be seen. But in the long-term the user will run into problems caused by the bugs. With MySQL, invalid dates are accepted unless special settings are turned on, but yes, warnings are emitted which most implementations ignore. This is bad. I strongly think it should be made an error, because it is most certainly an error, and even if it's not, it's at least bad coding style and the code should be fixed anyway, or if one is lazy, turn this off in the config file and make it a warning instead. I don't think we should be too afraid to break backward compatibility if it only affects a theoretical percentage of the users in a new major release.
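The mistake being described is easy to reproduce; a minimal PL/pgSQL sketch (table, function, and column names are illustrative):

```sql
CREATE FUNCTION get_balance(user_id int) RETURNS numeric AS $$
DECLARE
    user_id int;   -- silently shadows the input parameter; NULL from here on
    balance numeric;
BEGIN
    SELECT b.amount INTO balance
    FROM balances b
    WHERE b.user_id = user_id;  -- unqualified name resolves to the NULL local
    RETURN balance;             -- so this is always NULL
END;
$$ LANGUAGE plpgsql;
```

The inner DECLARE wins name resolution over the parameter, the WHERE clause compares against NULL, no row matches, and the function returns NULL for every input — exactly the "HOW CAN THIS VARIABLE ALWAYS BE NULL?" scenario the warning is meant to catch.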
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
KaiGai, * Stephen Frost (sfr...@snowman.net) wrote: And I'm still unconvinced of this approach and worry that it's going to break more often than it works. That's my 2c on it, but I won't get in the way if someone else wants to step up and support it. Alright, having heard from Robert (thanks!) regarding his thoughts (which are pretty similar to my own, though he doesn't anticipate issues with API changes), I'm going to step back a bit from the above position. As I mentioned up-thread, I'd really like to see FDW join push-down, FDW aggregate push-down, parallel query execution, and parallel remote-FDW execution and I don't see this CustomScan approach as the right answer to any of those. In accordance with the above, what I'd like to see with this patch is removal of the postgres_fdw changes and any changes which were for that support. In addition, I'd like to understand why 'ctidscan' makes any sense to have as an example of what to use this for- if that's valuable, why wouldn't we simply implement that in core? I do want an example in contrib of how to properly use this capability, but I don't think that's it. For one thing, an example where you could have this CustomScan node calling other nodes underneath would be interesting. I realize the CTID scan can't do that directly but I would think your GPU-based system could; after all, if you're running a join or an aggregate with the GPU, the rows could come from nearly anything. Have you considered that, or is the expectation that users will just go off and access the heap and/or whatever indexes directly, like ctidscan does? How would such a requirement be handled? Thanks, Stephen
Re: [HACKERS] building pgadmin4
pgAdmin is off-topic for this mailing list. So sorry, I misread the address in the readme file. Cheers, WBL -- Quality comes from focus and clarity of purpose -- Mark Shuttleworth
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Mon, Mar 03, 2014 at 07:19:45PM +, Simon Riggs wrote: On 3 March 2014 18:57, Noah Misch n...@leadboat.com wrote: On Mon, Mar 03, 2014 at 10:19:55AM -0500, Robert Haas wrote: On Thu, Feb 27, 2014 at 3:12 AM, Simon Riggs si...@2ndquadrant.com wrote: Removing SELECT privilege while running a SELECT would be a different matter. This is all a matter of definition; we can make up any rules we like. Doing so is IMHO a separate patch and not something to hold up the main patch. So I think this is an interesting point. There are various things that could go wrong as a result of using the wrong lock level. Worst would be that the server crashes or corrupts data. A little less bad would be that sessions error out with inexplicable error conditions, as in SnapshotNow days. Alternatively, we could just have arguably wrong behavior, like basing query results on the old version of the table's metadata even after it's been changed. I would order the concerns like this: 1. Data corruption 2. Transient, clearly-wrong answers without an error 3. Server crash 4. Catalog logical inconsistency 5. Inexplicable, transient errors 6. Valid behavior capable of surprising more than zero upgraders I like your model for risk assessment. How can we apply it in detail in a way that helps us decide? Or do we just go on gut feel? My experience with mentioning such topics is that without structure it results in an assessment of unacceptable risk just simply because somebody has mentioned some scary words. True; it's tough to make use of such a prioritization. By the time you can confidently assign something to a category, you can probably just fix it. (Or, in the case of category (6), document/release-note it.) Just to be clear, that list is not a commentary on the particular patch at hand. Those are merely the kinds of regressions to look for in a patch affecting this area of the code. err, guys, you do realise that changing ownership is staying at AccessExclusiveLock in this patch? 
(and I haven't ever suggested lowering that). Yep. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Changeset Extraction v7.9
Hi Robert, Everyone! On 2014-03-03 16:48:15 -0500, Robert Haas wrote: OK, I've committed the 0001 patch, which is the core of this feature, with a bit of minor additional hacking. Many, many, thanks! I'm sure there are some problems here yet and some things that people will want fixed, as is inevitable for any patch of this size. But I don't have any confidence that further postponing commit is going to be the best way to find those issues, so in it goes. Unsurprisingly I do agree with this. It's a big feature, and there's imperfection. But I think it's a good start. A very first such imperfection is that the buildfarm doesn't actually exercise make check in contribs, just make installcheck... Which this patch doesn't use because the tests require wal_level=logical and max_replication_slots = 2. Andrew said on IRC that maybe it's a good idea to add a make-contrib-check stage to the buildfarm. A patch fixing a couple of absolutely trivial things is attached. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 4fb0974..3c56238 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -9,7 +9,7 @@
  *
  * NOTES
  *	This file coordinates interaction between the various modules that
- *	together providethe logical decoding, primarily by providing so
+ *	together provide logical decoding, primarily by providing so
  *	called LogicalDecodingContexts. The goal is to encapsulate most of the
  *	internal complexity for consumers of logical decoding, so they can
  *	create and consume a changestream with a low amount of code.
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 3b8ae38..5fa1848 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -2,7 +2,7 @@
  *
  * logicalfuncs.c
  *
- * Support functions for using logical decoding and managemnt of
+ * Support functions for using logical decoding and management of
  * logical replication slots via SQL.
  *
  *
@@ -400,7 +400,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
 		ctx->options.output_type != OUTPUT_PLUGIN_TEXTUAL_OUTPUT)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("output plugin cannot produce text output")));
+				 errmsg("output plugin cannot produce binary output")));

 	ctx->output_writer_private = p;
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 4b25607..8ee9285 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -36,7 +36,7 @@ typedef bool (*SnapshotSatisfiesFunc) (HeapTuple htup,
 * There are several different kinds of snapshots:
 * * Normal MVCC snapshots
 * * MVCC snapshots taken during recovery (in Hot-Standby mode)
- * * Historic MVCC snapshots used during logical decoding
+ * * Historic MVCC snapshots used during logical decoding
 * * snapshots passed to HeapTupleSatisfiesDirty()
 * * snapshots used for SatisfiesAny, Toast, Self where no members are
 *   accessed.
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
As I mentioned up-thread, I'd really like to see FDW join push-down, FDW aggregate push-down, parallel query execution, and parallel remote-FDW execution and I don't see this CustomScan approach as the right answer to any of those. In accordance with the above, what I'd like to see with this patch is removal of the postgres_fdw changes and any changes which were for that support. I don't argue this approach. It might be useful to *demonstrate* how custom- scan node works as replacement of join, however, In addition, I'd like to understand why 'ctidscan' makes any sense to have as an example of what to use this for- if that's valuable, why wouldn't we simply implement that in core? I do want an example in contrib of how to properly use this capability, but I don't think that's it. Do you think it makes sense if my submission was only interface portion without working example? The purpose of ctidscan module is, similar to postgres_fdw, to demonstrate the usage of custom-scan interface with enough small code scale. If tons of code example were attached, nobody will want to review the patch. The cache_scan module that I and Haribabu are discussing in another thread also might be a good demonstration for custom-scan interface, however, its code scale is a bit larger than ctidscan. For one thing, an example where you could have this CustomScan node calling other nodes underneath would be interesting. I realize the CTID scan can't do that directly but I would think your GPU-based system could; after all, if you're running a join or an aggregate with the GPU, the rows could come from nearly anything. Have you considered that, or is the expectation that users will just go off and access the heap and/or whatever indexes directly, like ctidscan does? How would such a requirement be handled? In case when custom-scan node has underlying nodes, it shall be invoked using ExecProcNode as built-in node doing, then it will be able to fetch tuples come from underlying nodes. 
Of course, a custom-scan provider can present tuples that come from somewhere else as if they came from the underlying relation. That is the responsibility of the extension module. In some cases, it will be required to return junk system attributes, like ctid, for row-level locks or table updates. That is also the responsibility of the extension module (or it should not add a custom-path at all if the custom-scan provider cannot perform as required). Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
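[Editor's note: the calling pattern KaiGai describes - a custom-scan node pulling tuples from an underlying plan via ExecProcNode - can be sketched roughly as below. This is an illustrative, non-compilable pseudocode sketch against the executor idioms built-in nodes use (ExecProcNode, outerPlanState, TupIsNull are real); the state-node layout and the callback name are assumptions, not the patch's actual interface.]

```c
/* Illustrative sketch only; does not compile outside the server tree.
 * Assumes a custom-scan state node that keeps one child plan state in
 * its outer-plan slot, as the built-in join/agg nodes do.
 */
static TupleTableSlot *
customscan_exec_sketch(CustomScanState *node)   /* name is hypothetical */
{
    /* Pull the next tuple from the underlying node, exactly as the
     * built-in nodes do. */
    TupleTableSlot *slot = ExecProcNode(outerPlanState(node));

    if (TupIsNull(slot))
        return NULL;                /* child plan is exhausted */

    /* The provider may transform or filter the tuple here.  If the scan
     * must support row-level locks or updates, returning junk system
     * attributes such as ctid is also the provider's responsibility. */
    return slot;
}
```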
Re: [HACKERS] plpgsql.warn_shadow
Joel Jacobson j...@trustly.com writes: I strongly think it should be made an error, because it is most certainly an error, and even if it's not, it's at least bad coding style and the code should be fixed anyway, or if one is lazy, turn this off in the config file and make it a warning instead. You're reasoning from a false premise: it's *not* necessarily an error. If this were such a bad idea as you claim, generations of programming language designers wouldn't have made their languages work like this. regards, tom lane
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Noah Misch n...@leadboat.com writes: Just to be clear, that list is not a commentary on the particular patch at hand. Those are merely the kinds of regressions to look for in a patch affecting this area of the code. A complaint on pgsql-bugs just now reminded me of a specific area that needs to be looked at hard: how bad are the implications for pg_dump? Up to now, pg_dump could be reasonably confident that once it had AccessShareLock on every table it intended to dump, there would be no schema changes happening on those tables until it got done. This greatly ameliorates the snapshot-skew problems that arise from its habit of doing some things for itself and other things via backend-internal functions (which historically used SnapshotNow and now use a fresh MVCC snapshot, either way potentially quite newer than the transaction snapshot pg_dump's own queries will use). I suspect that lowering the lock requirements for anything that affects what pg_dump can see is going to make things a whole lot worse in terms of consistency of pg_dump output in the face of concurrent DDL. Admittedly, we're not perfect in that area now, but I'm not sure that's an adequate excuse for making it worse. regards, tom lane
Re: [HACKERS] jsonb and nested hstore
On Fri, Feb 28, 2014 at 2:12 PM, Peter Geoghegan p...@heroku.com wrote: In order to make a rational decision to do the work incrementally, we need to know what we're putting off until 9.5. AFAICT, we have these operator classes that work fine with jsonb for the purposes of hstore-style indexing (the hstore operator classes). Wasn't that the point? When there is a dedicated set of jsonb operator classes, what will be different about them, other than the fact that they won't be hstore operator classes? A decision predicated on waiting for those to come in 9.5 must consider what we're actually waiting for, and right now that seems very hazy. I really would like an answer to this question. Even if I totally agreed with Andrew's assessment of the relative importance of having jsonb be an in-core type, versus having some more advanced indexing capabilities right away, this is still a very salient question. I appreciate that the put jsonb in hstore extension to get indexing right away trade-off is counter-intuitive, and it may even be that there is an everybody wins third way that sees us factor out code that is common to both jsonb and hstore as it exists today (although I'm not optimistic about that). I would like to emphasize that if you want to defer working on hstore-style jsonb operator classes for one release, I don't necessarily have a problem with that. But, I must insist on an answer here, from either you or Oleg or Teodor (it isn't apparent that Oleg and Teodor concur with you on what's important): what did we run out of time for? What will be different about the jsonb operator classes that you're asking us to wait for in a future release? I understand that there are ambitious plans for a VODKA-am that will support indexing operations on nested structures that are a lot more advanced than those enabled by the hstore operator classes included in these patches. 
However, surely these hstore operator classes have independent value, or represent incremental progress? -- Peter Geoghegan
Re: [HACKERS] Securing make check (CVE-2014-0067)
On Mon, Mar 03, 2014 at 01:29:00AM -0500, Tom Lane wrote: Noah Misch n...@leadboat.com writes: Concerning the immediate fix for non-Windows systems, does any modern system ignore modes of Unix domain sockets? It appears to be a long-fixed problem: What I was envisioning was that we'd be relying on the permissions of the containing directory to keep out bad guys. Permissions on the socket itself might be sufficient, but what does it save us to assume that? My first preference is to use the simplest code that POSIX requires to have the behavior we desire. POSIX specifies as implementation-defined whether connect() checks filesystem permissions. That's true of both directory search permissions and permissions on the socket itself. POSIX alone can't help us here. My second preference is to use the simplest code known to be portable to all credible PostgreSQL target systems. Brief research was inconclusive, but it turned up no solid evidence of a modern target ignoring socket permissions. (It did turn up solid evidence of 15-year-old targets having that problem.) I found no evidence either way concerning the prevalence of systems that ignore directory search permissions above sockets. I don't care for interposing a directory based solely on the fact that some ancient systems needed that. Changing unix_socket_permissions is a one-liner in each test driver. Placing the socket in a directory entails setting PGHOST in the psql and postmaster environments and cleaning up the directory on exit. That would be fine if restricted to pg_regress, but it would also show up in contrib/pg_upgrade/test.sh, perhaps eventually in vcregress.pl:upgradecheck(), perhaps in the buildfarm code, in the DBD::Pg test suite, and in any other test suite that creates a temporary cluster. We should not lead all those test drivers into using a temporary socket directory based on long-gone bugs or cargo cult programming. If there are notable systems today where it helps, that's a different matter. 
Also, test drivers should not be the sole place where we express doubt about the reliability of socket permissions. If they are unreliable on a noteworthy target, then the unix_socket_permissions documentation ought to say so. nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
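[Editor's note: for reference, the setting Noah calls a one-liner lives in postgresql.conf (or can be passed with -c by a test driver). A restrictive value for a throwaway test cluster might look like this; the octal mode shown is an illustrative choice, and, as the thread discusses, it only helps on systems whose connect() actually checks socket-file permissions.]

```
# postgresql.conf fragment (illustrative)
unix_socket_permissions = 0700   # only the socket's owner may connect
```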
Re: [HACKERS] jsonb and nested hstore
On 03/03/2014 04:50 PM, Peter Geoghegan wrote: I understand that there are ambitious plans for a VODKA-am that will support indexing operations on nested structures that are a lot more advanced than those enabled by the hstore operator classes included in these patches. However, surely these hstore operator classes have independent value, or represent incremental progress? Primary value is that in theory the hstore2 opclasses are available *now*, as opposed to a year from now. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 3, 2014 at 4:57 PM, Josh Berkus j...@agliodbs.com wrote: On 03/03/2014 04:50 PM, Peter Geoghegan wrote: I understand that there are ambitious plans for a VODKA-am that will support indexing operations on nested structures that are a lot more advanced than those enabled by the hstore operator classes included in these patches. However, surely these hstore operator classes have independent value, or represent incremental progress? Primary value is that in theory the hstore2 opclasses are available *now*, as opposed to a year from now. Well, yes, that's right. Although we cannot assume that VODKA will get into 9.5 - it's a big project. Nor is it obvious to me that a VODKA-ized set of operator classes would not bring with them exactly the same dilemma as we now face vis-a-vis hstore code reuse and GIN operator classes. So it seems reasonable to me to suppose that VODKA should not influence our decision here. Please correct me if I'm mistaken. -- Peter Geoghegan
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 2014-03-03 19:15:27 -0500, Tom Lane wrote: Noah Misch n...@leadboat.com writes: Just to be clear, that list is not a commentary on the particular patch at hand. Those are merely the kinds of regressions to look for in a patch affecting this area of the code. A complaint on pgsql-bugs just now reminded me of a specific area that needs to be looked at hard: how bad are the implications for pg_dump? Up to now, pg_dump could be reasonably confident that once it had AccessShareLock on every table it intended to dump, there would be no schema changes happening on those tables until it got done. The guarantee wasn't actually that strong. It already was quite possible that indexes got created/dropped during that time, which probably is the by far most frequent DDL run in production. This greatly ameliorates the snapshot-skew problems that arise from its habit of doing some things for itself and other things via backend-internal functions (which historically used SnapshotNow and now use a fresh MVCC snapshot, either way potentially quite newer than the transaction snapshot pg_dump's own queries will use). Yea, I wonder if we shouldn't start to make them use a different snapshot. It's the pg_get_*def() functions, or is there something else? Afair (I really haven't rechecked) all the actions that have a changed locklevels affect things that pg_dump recreates clientside, using a repeatable read snapshot, so there shouldn't be much change there? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
On 03/03/2014 05:07 PM, Peter Geoghegan wrote: Primary value is that in theory the hstore2 opclasses are available *now*, as opposed to a year from now. Well, yes, that's right. Although we cannot assume that VODKA will get into 9.5 - it's a big project. Nor is it obvious to me that a VODKA-ized set of operator classes would not bring with them exactly the same dilemma as we now face vis-a-vis hstore code reuse and GIN operator classes. So it seems reasonable to me to suppose that VODKA should not influence our decision here. Please correct me if I'm mistaken. I would agree with you. Andrew was onsite with a client over the weekend, which is why you haven't heard from him on this thread. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Securing make check (CVE-2014-0067)
Noah Misch n...@leadboat.com writes: On Mon, Mar 03, 2014 at 01:29:00AM -0500, Tom Lane wrote: What I was envisioning was that we'd be relying on the permissions of the containing directory to keep out bad guys. Permissions on the socket itself might be sufficient, but what does it save us to assume that? My first preference is to use the simplest code that POSIX requires to have the behavior we desire. POSIX specifies as implementation-defined whether connect() checks filesystem permissions. That's true of both directory search permissions and permissions on the socket itself. Surely you are misinterpreting that. If it worked as you suggest, connect() would provide a trivial method of bypassing directory permissions, at least to the extent of being able to probe for the existence of files within supposedly-unreadable directories. I can believe that there are implementations that don't examine the permissions on the socket file itself, but not that there are any that disregard directory permissions during the file lookup. I found no evidence either way concerning the prevalence of systems that ignore directory search permissions above sockets. That's because the question is ridiculous on its face, so nobody ever bothered to address it. I think the burden is on you to show that there has ever been any system that read the spec the way you propose. I don't care for interposing a directory based solely on the fact that some ancient systems needed that. Changing unix_socket_permissions is a one-liner in each test driver. Placing the socket in a directory entails setting PGHOST in the psql and postmaster environments and cleaning up the directory on exit. Placing the socket anywhere besides the default location will require setting PGHOST anyway, so I don't see that this argument holds much water. The cleanup aspect is likewise not that exciting; pg_regress creates a lot of stuff it doesn't remove. 
regards, tom lane
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Andres Freund and...@2ndquadrant.com writes: On 2014-03-03 19:15:27 -0500, Tom Lane wrote: This greatly ameliorates the snapshot-skew problems that arise from its habit of doing some things for itself and other things via backend-internal functions (which historically used SnapshotNow and now use a fresh MVCC snapshot, either way potentially quite newer than the transaction snapshot pg_dump's own queries will use). Yea, I wonder if we shouldn't start to make them use a different snapshot. It's the pg_get_*def() functions, or is there something else? See past discussions. By the time you trace all of ruleutils.c's dependencies, a dauntingly large fraction of the backend's basic catalog access support is implicated. For example, you'd need some way of making the catcaches return data that they know to be outdated. And I'm pretty sure pg_dump is making free use of stuff that isn't even in ruleutils. I would like to see pg_dump doing something much more bulletproof, but I'm afraid that there isn't any nice simple fix available. One relatively narrow fix that would probably make it a lot better *in the current state of affairs* is to provide a way whereby, once pg_dump has locks on everything it wants to dump, it can advance its transaction snapshot to current time. Then at least both its own queries and the backend's lookups will see post-locking state. However, if AccessShareLock isn't enough to block DDL on the tables then we're still hosed. Afair (I really haven't rechecked) all the actions that have a changed locklevels affect things that pg_dump recreates clientside, using a repeatable read snapshot, so there shouldn't be much change there? You're missing the point entirely if you think pg_dump recreates everything client-side. It's never done that 100%, and we've migrated more and more stuff to the server side over time to avoid duplicate implementations of things like index expression decompilation. 
So while a theoretical answer would be to remove all of pg_dump's dependency on server-side functions, I am pretty sure that's never going to happen. regards, tom lane
Re: [HACKERS] Row-security on updatable s.b. views
On 02/25/2014 01:28 AM, Dean Rasheed wrote: On 13 February 2014 04:12, Craig Ringer cr...@2ndquadrant.com wrote: It's crashing while pulling up the query over emp (hl7.employee) and part (hl7.participation). Given the simplicity of what the row-security code itself is doing, I'm wondering if this is a case that isn't handled in updatable s.b. views. I'll look into it. I'm not sure how much further you've got with this, but I think the issue is that the securityQuals that you're adding don't refer to the correct RTE. When adding securityQuals to an RTE, they are expected to have Vars whose varno matches the rt_index of the RTE (see for example the code in rewriteTargetView() which calls ChangeVarNodes() on viewqual before adding the qual to securityQuals or the main query jointree). prepend_row_security_quals() doesn't appear to have any similar code, and it would need to be passed the rt_index to do that. Thanks for the pointer. That was indeed the issue. I've pushed an update to the branch with the fix for varno handling. Thanks. It's tagged rls-9.4-upd-sb-views-v8. I've almost run out of time to spend on row security for this commitfest, unfortunately. I'm putting a blog together with a current status update. Frustrating, as it's coming together now. Open issues include:
- Passing plan inval items from rewriter into planner
- COPY support pending
- Clear syntax in DDL
Most of the rest are solved; it's actually looking pretty good. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 2014-03-03 20:32:13 -0500, Tom Lane wrote: Afair (I really haven't rechecked) all the actions that have a changed locklevels affect things that pg_dump recreates clientside, using a repeatable read snapshot, so there shouldn't be much change there? You're missing the point entirely if you think pg_dump recreates everything client-side. No, I am obviously not thinking that. What I mean is that the things that actually change their locking requirement in the proposed patch primarily influence things that are reconstructed clientside by pg_dump. E.g. ALTER TABLE ... CLUSTER ON, SET(...), ... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Securing make check (CVE-2014-0067)
I wrote: Placing the socket anywhere besides the default location will require setting PGHOST anyway, so I don't see that this argument holds much water. The cleanup aspect is likewise not that exciting; pg_regress creates a lot of stuff it doesn't remove. There's another point here, if you think back to the discussion as it stood before we noticed that there was a security problem. What was originally under discussion was making life easier for packagers who want to build with a default socket location like /var/run/postgresql/. In a build environment, that directory may not exist, and even if it does, there is no way that the build user should have write permission on it. So it is already the case that some packagers have to override the socket location if they want to do make check while packaging, and what we were originally on about was making that part of the normal pg_regress procedures instead of being something requiring patching. So, whether or not unix_socket_permissions would be a bulletproof security fix by itself, I'd still want to see some provisions made for putting the socket into a local directory during make check. regards, tom lane
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 3, 2014 at 5:09 PM, Josh Berkus j...@agliodbs.com wrote: On 03/03/2014 05:07 PM, Peter Geoghegan wrote: Primary value is that in theory the hstore2 opclasses are available *now*, as opposed to a year from now. Well, yes, that's right. Although we cannot assume that VODKA will get into 9.5 - it's a big project. Nor is it obvious to me that a VODKA-ized set of operator classes would not bring with them exactly the same dilemma as we now face vis-a-vis hstore code reuse and GIN operator classes. So it seems reasonable to me to suppose that VODKA should not influence our decision here. Please correct me if I'm mistaken. I would agree with you. Good. Hopefully you also mean that you recognize the dilemma referred to above - that the hstore code reuse made a certain amount of sense, and that more than likely the best way forward is to work out a way to make it work. I'm not immediately all that concerned about what the least worst way of doing that is (I just favor putting everything in hstore on practical grounds). Another way to solve this problem might be to simply live with the code duplication between core and hstore on the grounds that hstore will eventually be deprecated as people switch to jsonb over time (so under that regime nothing new would ever be added to hstore, which we'd eventually remove from contrib entirely, while putting everything new here in core). I don't favor that approach, but it wouldn't be totally unreasonable, based on the importance that is attached to jsonb, and based on what I'd estimate to be the actual amount of code redundancy that that would create (assuming that hstore gets no new functions and operators, since an awful lot of the hstore-local code after this patch is applied is new to hstore). I wouldn't stand in the way of this approach. In my view the most important thing right now is that before anything is committed, at the very least there needs to be a strategy around getting hstore-style GIN indexing in jsonb. 
I cannot understand how you can have an operator class today that works fine for hstore-style indexing of jsonb (as far as that goes), but that that code is out of bounds just because it's nominally (mostly new) hstore code, and you cannot figure out a way of making that work that is acceptable from a code maintenance perspective. If you cannot figure that out in a few days, why should you ever be able to figure it out? We need to bite the bullet here, whatever that might actually entail. Can we agree on that much? -- Peter Geoghegan
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Andres Freund and...@2ndquadrant.com writes: On 2014-03-03 20:32:13 -0500, Tom Lane wrote: You're missing the point entirely if you think pg_dump recreates everything client-side. No, I am obviously not thinking that. What I mean is that the things that actually change their locking requirement in the proposed patch primarily influence things that are reconstructed clientside by pg_dump. E.g. ALTER TABLE ... CLUSTER ON, SET(...), ... [ raised eyebrow... ] I'm pretty sure that no such constraint was part of the design discussion to start with. Even if it accidentally happens to be the case now, it sounds utterly fragile. regards, tom lane
Re: [HACKERS] jsonb and nested hstore
On 03/03/2014 07:50 PM, Peter Geoghegan wrote: On Fri, Feb 28, 2014 at 2:12 PM, Peter Geoghegan p...@heroku.com wrote: In order to make a rational decision to do the work incrementally, we need to know what we're putting off until 9.5. AFAICT, we have these operator classes that work fine with jsonb for the purposes of hstore-style indexing (the hstore operator classes). Wasn't that the point? When there is a dedicated set of jsonb operator classes, what will be different about them, other than the fact that they won't be hstore operator classes? A decision predicated on waiting for those to come in 9.5 must consider what we're actually waiting for, and right now that seems very hazy. I really would like an answer to this question. Even if I totally agreed with Andrew's assessment of the relative importance of having jsonb be an in-core type, versus having some more advanced indexing capabilities right away, this is still a very salient question. (Taking a break from some frantic customer work) My aim for 9.4, given constraints of both the development cycle and my time budget, has been to get jsonb to a point where it has equivalent functionality to json, so that nobody is forced to say well I'll have to use json because it lacks function x. For the processing functions, i.e. those that don't generate json from non-json, this should be true with what's proposed. The jsonb processing functions should be about as fast as, or in some cases significantly faster than, their json equivalents. Parsing text input takes a little longer (surprisingly little, actually), and reserializing takes significantly longer - I haven't had a chance to look and see if we can improve that. Both of these are more or less expected results. For 9.5 I would hope that we have at least the equivalent of the proposed hstore classes. But that's really just a start. Frankly, I think we need to think a lot harder about how we want to be able to index this sort of data. 
The proposed hstore operators appear to me to be at best just scratching the surface of that. I'd like to be able to index jsonb's #> and #>> operators, for example. Unanchored subpath operators could be an area that's interesting to implement and index. I also would like to see some basic insert/update/delete/merge operators for json/jsonb - that's an area I wanted to work on for this release but wasn't able to arrange. Note that future development is a major topic of my pgcon talk, and I'm hoping that we can get some good discussion going there. cheers andrew
Re: [HACKERS] jsonb and nested hstore
On 03/03/2014 06:17 PM, Peter Geoghegan wrote: Good. Hopefully you also mean that you recognize the dilemma referred to above - that the hstore code reuse made a certain amount of sense, and that more than likely the best way forward is to work out a way to make it work. I'm not immediately all that concerned about what the least worst way of doing that is (I just favor putting everything in hstore on practical grounds). Well, I don't see how this is on practical grounds at this point. Whether we keep jsonb in core or not, we have an equal amount of work ahead of us. That's why I said the single-extension approach was conceptually simpler rather than actually simpler. It's easier to understand, not necessarily easier to implement at this point. Also, please recognize that the current implementation was what we collectively decided on three months ago, and what Andrew worked pretty hard to implement based on that collective decision. So if we're going to change course, we need a specific reason to change course, not just it seems like a better idea now or I wasn't paying attention then. The one extension to rule them all approach also has the issue of Naming Things, but I think that could be solved with a symlink or two. Another way to solve this problem might be to simply live with the code duplication between core and hstore on What code duplication? the grounds that hstore will eventually be deprecated as people switch to jsonb over time (so under that regime nothing new would ever be added to hstore, which we'd eventually remove from contrib entirely, while putting everything new here in core). I don't favor that approach, but it wouldn't be totally unreasonable, based on the importance that is attached to jsonb, and based on what I'd estimate to be the actual amount of code redundancy that that would create (assuming that hstore gets no new functions and operators, since an awful lot of the hstore-local code after this patch is applied is new to hstore). 
I wouldn't stand in the way of this approach. Realistically, hstore will never go away. I'll bet you a round or two of pints that, if we get both hstore2 and jsonb, within 2 years the users of jsonb will be an order of magnitude more numerous than the users of hstore, but folks out there have apps already built against hstore1 and they're going to keep on the hstore path. In the discussion you haven't apparently caught up on yet, we did discuss moving *hstore* into core to make this whole thing easier. However, that fell afoul of the fact that we currently have no mechanism to move types between extensions and core without breaking upgrades for everyone. So the only reason hstore is still an extension is because of backwards-compatibility. In my view the most important thing right now is that before anything is committed, at the very least there needs to be a strategy around getting hstore-style GIN indexing in jsonb. I think it's a good idea to have a strategy. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 3, 2014 at 6:54 PM, Andrew Dunstan and...@dunslane.net wrote: My aim for 9.4, given constraints of both the development cycle and my time budget, has been to get jsonb to a point where it has equivalent functionality to json, so that nobody is forced to say well I'll have to use json because it lacks function x. For the processing functions, i.e. those that don't generate json from non-json, this should be true with what's proposed. The jsonb processing functions should be about as fast as, or in some cases significantly faster than, their json equivalents. Parsing text input takes a little longer (surprisingly little, actually), and reserializing takes significantly longer - I haven't had a chance to look and see if we can improve that. Both of these are more or less expected results. Okay, that's fine. I'm sure that jsonb has some value without hstore-style indexing. That isn't really in question. What is in question is why you would choose to give up on those capabilities. For 9.5 I would hope that we have at least the equivalent of the proposed hstore classes. But the equivalent code to the proposed hstore operator classes is *exactly the same* C code. The two types are fully binary coercible in the patch, so why delay? Why is that additional step appreciably riskier than adopting jsonb? I don't see why the functions associated with the operators that comprise, say, the gin_hstore_ops operator class represent much additional risk, assuming that jsonb is itself in good shape. For example, the new hstore_contains() looks fairly innocuous compared to much of the code you are apparently intent on including in the first cut at jsonb. Have I missed something? Why are those operators riskier than the operators you are intent on including? If it is true that you think that's a significant additional risk, a risk too far, then it makes sense that you'd defer doing this. I would like to know why that is, though, since I don't see it. 
Speaking of missing operator classes, I'm pretty sure that it's ipso facto unacceptable that there is no default btree operator class for the type jsonb:

[local]/postgres=# \d+ bar
                       Table "public.bar"
 Column | Type  | Modifiers | Storage  | Stats target | Description
--------+-------+-----------+----------+--------------+-------------
 i      | jsonb |           | extended |              |
Has OIDs: no

[local]/postgres=# select * from bar order by i;
ERROR:  42883: could not identify an ordering operator for type jsonb
LINE 1: select * from bar order by i;
                                   ^
HINT:  Use an explicit ordering operator or modify the query.
LOCATION:  get_sort_group_operators, parse_oper.c:221
Time: 6.424 ms

[local]/postgres=# select distinct i from bar;
ERROR:  42883: could not identify an equality operator for type jsonb
LINE 1: select distinct i from bar;
                        ^
LOCATION:  get_sort_group_operators, parse_oper.c:226
Time: 6.457 ms

But that's really just a start. Frankly, I think we need to think a lot harder about how we want to be able to index this sort of data. The proposed hstore operators appear to me to be at best just scratching the surface of that. I'd like to be able to index jsonb's #> and #>> operators, for example. Unanchored subpath operators could be an area that's interesting to implement and index. I'm sure that's true, but it's not our immediate concern. We need to think very hard about it to get everything we want, but we also need to think somewhat harder about it in order to get even a basic jsonb type committed.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
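For reference, closing that gap would mean a default btree operator class for jsonb. A minimal SQL-level sketch follows; note that the comparison operators on jsonb and the jsonb_cmp() support function are assumptions for illustration, not functions from the patch as posted:

```sql
-- Hypothetical sketch only: assumes btree-compatible <, <=, =, >=, >
-- operators on jsonb and a jsonb_cmp(jsonb, jsonb) support function
-- already exist at the C level.
CREATE OPERATOR CLASS jsonb_ops
    DEFAULT FOR TYPE jsonb USING btree AS
        OPERATOR 1 <  ,
        OPERATOR 2 <= ,
        OPERATOR 3 =  ,
        OPERATOR 4 >= ,
        OPERATOR 5 >  ,
        FUNCTION 1 jsonb_cmp(jsonb, jsonb);
```

With a default btree operator class in place, ORDER BY and SELECT DISTINCT on a jsonb column would no longer fail in get_sort_group_operators() as shown in the session above.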
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On Mon, Mar 3, 2014 at 3:46 PM, Andres Freund and...@2ndquadrant.com wrote:
> On 2014-03-01 13:29:18 +0530, Amit Kapila wrote:
>> With the new patch, the message while updating locked rows will be displayed as below:
>>
>> LOG: process 364 still waiting for ShareLock on transaction 678 after 1014.000 ms
>> CONTEXT: while attempting to lock tuple (0,2) with values (2) in relation public.t1 of database postgres
>> LOG: process 364 acquired ShareLock on transaction 678 after 60036.000 ms
>> CONTEXT: while attempting to lock tuple (0,2) with values (2) in relation public.t1 of database postgres
>>
>> Now I am not sure if the second message is an improvement, as what it sounds like is 'while attempting to lock the tuple, it got a share lock on the transaction'. If you, Robert, or others feel it is okay, then we can retain it as it is in the patch; otherwise I think we either need to rephrase it or perhaps try some other way (a global variable) so that it appears only in the required case. I feel the way Robert has suggested, i.e. making it the Detail of the particular message (we might need to use a global variable to pass certain info), is the better way and will have minimal impact on the cases where this additional information needs to be displayed.
>
> I really don't understand the origins of your arguments here. Why shouldn't a row lock caused by an UPDATE be relevant?

The point I am trying to make is about the following part of the context message: "while attempting to lock tuple (0,2)". In the above situation, we are actually trying to acquire a lock on a transaction in order to operate on a tuple, so I think it would be better to rephrase it (perhaps by using 'operate on' instead of 'lock').

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
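For context, the log messages above come from a scenario like the following two-session sequence. This is a sketch: the table definition and values are illustrative, and the CONTEXT lines appear only with the proposed patch applied.

```sql
-- Assumes: log_lock_waits = on, deadlock_timeout = 1s, and a table
-- created as:
--   CREATE TABLE t1 (c1 int);
--   INSERT INTO t1 VALUES (1), (2);

-- Session 1: lock the row and keep the transaction open.
BEGIN;
UPDATE t1 SET c1 = c1 WHERE c1 = 2;

-- Session 2: blocks waiting on session 1's transaction; once
-- deadlock_timeout elapses, the "still waiting for ShareLock on
-- transaction ..." message is logged, and the proposed CONTEXT line
-- identifies the tuple being waited on, e.g. tuple (0,2).
UPDATE t1 SET c1 = c1 + 1 WHERE c1 = 2;
```

The debate above is about the second message: when session 2 finally "acquired ShareLock on transaction", the tuple was never directly locked, which is why Amit finds the "while attempting to lock tuple" wording misleading there.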
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 3, 2014 at 7:39 PM, Peter Geoghegan p...@heroku.com wrote:
> But that's really just a start. Frankly, I think we need to think a lot harder about how we want to be able to index this sort of data. The proposed hstore operators appear to me to be at best just scratching the surface of that. I'd like to be able to index jsonb's #> and #>> operators, for example. Unanchored subpath operators could be an area that's interesting to implement and index. I'm sure that's true, but it's not our immediate concern. We need to think very hard about it to get everything we want, but we also need to think somewhat harder about it in order to get even a basic jsonb type committed.

By the way, I think it would be fine to defer adding many of the new hstore operators and functions until 9.5 (as hstore infrastructure, or in-core jsonb infrastructure, or anything else), if you felt you had to, provided that you included just those sufficient to create jsonb operator classes (plus the operator classes themselves, of course). There is absolutely no question about having to do this for B-Tree... why not go a couple of operator classes further?

-- 
Peter Geoghegan
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Mon, Mar 3, 2014 at 2:40 PM, Hannu Krosing ha...@2ndquadrant.com wrote:
> On 03/03/2014 05:22 PM, Tom Lane wrote:
>> Stephen Frost sfr...@snowman.net writes:
>>> ... ISTR the discussion going something along the lines of "we'd have to WAL-log the entire table to do that, and if we have to do that, what's the point?".
>>
>> IIRC, the reason you'd have to do that is to make the table contents appear on slave servers. If you don't consider replication then it might seem easier.
>
> So switch on logging and then perform CLUSTER/VACUUM FULL? Should this work, or is something extra needed?

Today I do something like that:

1) create unlogged table tmp_foo ...
2) populate 'tmp_foo' table (ETL scripts or whatever)
3) start transaction
4) lock table tmp_foo in access exclusive mode
5) update pg_class set relpersistence = 'p' where oid = 'tmp_foo'::regclass
6) drop table foo; -- the old foo table
7) alter table tmp_foo rename to foo;
8) end transaction
9) run pg_repack on table 'foo'

I know it's very ugly, but it works... and works for standbys too... :-)

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
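Spelled out as SQL, the steps above look roughly like this. This is a sketch of the hack exactly as described: updating pg_class directly is unsupported and unsafe (relpersistence = 'p' merely marks the relation as permanent in the catalog; the existing data was never WAL-logged).

```sql
CREATE UNLOGGED TABLE tmp_foo (LIKE foo INCLUDING ALL);
-- ... populate tmp_foo via ETL scripts or whatever ...

BEGIN;
LOCK TABLE tmp_foo IN ACCESS EXCLUSIVE MODE;
-- Unsupported catalog hack: flip the persistence flag by hand.
UPDATE pg_class SET relpersistence = 'p'
 WHERE oid = 'tmp_foo'::regclass;
DROP TABLE foo;                     -- the old foo table
ALTER TABLE tmp_foo RENAME TO foo;
COMMIT;

-- Finally, rewrite the table so the contents actually get WAL-logged:
--   pg_repack --table=foo
```

The final pg_repack rewrite is what makes this survive on standbys: the rewrite goes through normal WAL-logged paths, which is exactly the "WAL log the entire table" cost that the earlier discussion said an in-core ALTER TABLE ... SET LOGGED would have to pay anyway.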
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
On Mon, Mar 3, 2014 at 9:13 PM, Stephen Frost sfr...@snowman.net wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote:
>>> Erm, my thought was to use a select() loop which sends out I/O requests and then loops around waiting to see who finishes it. It doesn't parallelize the CPU cost of getting the rows back to the caller, but it'd at least parallelize the I/O, and if what's underneath is actually a remote FDW running a complex query (because the other side is actually a view), it would be a massive win to have all the remote FDWs executing concurrently instead of serially as we have today.
>>
>> I can't really make sense of this.
>
> Sorry, that was a bit hand-wavey since I had posted about it previously here: http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net It'd clearly be more involved than just "build a select() loop" and would require adding an async mechanism. I had been thinking about this primarily with the idea of FDWs, and you're right that it'd require more thought to deal with getting data into/through shared_buffers. Still, we seqscan into a ring buffer; I'd think we could make it work, but it would require additional work.
>
>> For FDWs, one idea might be to kick off the remote query at ExecInitNode() time rather than ExecProcNode() time, at least if the remote query doesn't depend on parameters that aren't available until run time.
>
> Right, I had speculated about that also (option #2 in my earlier email).

During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries to foreign servers, those would be fired while EXPLAINing a query as well. We want to avoid that. Instead, we can run EXPLAIN on that query at the foreign server. But again, not all foreign servers would be able to EXPLAIN the query, e.g. file_fdw. Or we could avoid firing the query during ExecInitNode() entirely when it's for EXPLAIN (except for EXPLAIN ANALYZE, maybe).
>> That actually would allow multiple remote queries to run simultaneously or in parallel with local work. It would also run them in cases where the relevant plan node is never executed, which would be bad but perhaps rare enough not to worry about.
>
> This was my primary concern, along with the fact that the docs for the FDW API explicitly say "don't do that".
>
>> Or we could add a new API like ExecPrefetchNode() that tells nodes to prepare to have tuples pulled, and they can do things like kick off asynchronous queries. But I still don't see any clean way for the Append node to find out which one is ready to return results first.
>
> Yeah, that's tricky.
>
> Thanks,
>
> Stephen

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company