Re: [HACKERS] Vitesse DB call for testing
Hi Mark,

Vitesse DB won't be open-sourced, or it would have been a contrib module to Postgres. We should take further discussions off this list; people should contact me directly if there are any questions.

Thanks,
ck...@vitessedata.com

On Fri, Oct 17, 2014 at 10:55 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:
> On 18/10/14 07:13, Josh Berkus wrote:
>> CK, Before we go any further on this, how is Vitesse currently licensed? Last time we talked it was still proprietary. If it's not being open-sourced, we likely need to take discussion off this list.
> +1
> Guys, you need to 'fess up on the licensing!
> Regards, Mark

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 18/10/14 06:13, MauMau wrote:

[requirement] 10.2.3 Verify access to all audit trails is logged. Malicious users often attempt to alter audit logs to hide their actions, and a record of access allows an organization to trace any inconsistencies or potential tampering of the logs to an individual account. Having access to logs identifying changes, additions, and deletions can help retrace steps made by unauthorized personnel.

[my comment] I'm totally unsure how this can be fulfilled.

This is more a matter of configuration of the whole system than something pg_audit itself needs to care about (see my answer to 10.5).

[requirement] 10.3 Record at least the following audit trail entries for all system components for each event: 10.3.4 Verify success or failure indication is included in log entries. 10.3.5 Verify origination of event is included in log entries.

[my comment] This doesn't seem to be fulfilled because the failure of SQL statements and the client address are not part of the audit log entry.

You can add it to the log_line_prefix though.

[requirement] 10.5 Secure audit trails so they cannot be altered. 10.5 Interview system administrators and examine system configurations and permissions to verify that audit trails are secured so that they cannot be altered as follows: 10.5.1 Only individuals who have a job-related need can view audit trail files. Adequate protection of the audit logs includes strong access control (limit access to logs based on “need to know” only), and use of physical or network segregation to make the logs harder to find and modify. Promptly backing up the logs to a centralized log server or media that is difficult to alter keeps the logs protected even if the system generating the logs becomes compromised. 10.5.2 Protect audit trail files from unauthorized modifications.

[my comment] I don't know how to achieve these, because the DBA (who starts/stops the server) can modify and delete server log files without any record.
Logging can be set up in a way that it's not even stored on a server the DBA has access to. The DBA can turn off logging (and the plugin) altogether, or change the logging config, but we'd get the shutdown log when that happens, so I think 10.2.2 and 10.2.6 will be fulfilled in that case.

-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 18 October 2014 05:13, MauMau maumau...@gmail.com wrote:

[requirement] 10.6 Review logs and security events for all system components to identify anomalies or suspicious activity. Note: Log harvesting, parsing, and alerting tools may be used to meet this Requirement. The log review process does not have to be manual. The use of log harvesting, parsing, and alerting tools can help facilitate the process by identifying log events that need to be reviewed.

[my comment] What commercial and open source products are well known as log harvesting, parsing, and alerting tools? Is it possible and reasonably easy to integrate pgaudit with those tools? The purpose of an audit logging feature is not recording facts, but enabling timely detection of malicious actions. So, I think the ease of integration with those tools must be evaluated. But I don't know about such tools. I feel the current output format of pgaudit is somewhat difficult to treat:

* The audit log entries are mixed with other logs in the server log files, so the user has to extract the audit log lines from the server log files and save them elsewhere. I think it is necessary to store audit logs in separate files.

* Does the command text need quotes around it in case it contains commas?

Audit entries are sent to the server log, yes. The server log may be redirected to syslog, which allows various forms of routing and manipulation that are outside of the reasonable domain of pgaudit. PostgreSQL also provides a logging hook that would allow you to filter or redirect messages as desired. Given those two ways of handling server log messages, the server log is the obvious destination to provide for the recording/logging part of the audit requirement. pgaudit is designed to allow generating useful messages, not to be an out-of-the-box compliance tool.
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Optimizer on sort aggregate
On Sat, Oct 18, 2014 at 3:10 AM, Peter Geoghegan p...@heroku.com wrote:
> So the first eight bytes of the first string is 0x131F1F1B221E, and the second 0x131F1F1B220C. The last byte is different.

That's interesting, but I think it's mostly a quirk of your example. AFAICS the difference is only that the en_US locale ignores punctuation like : and / (or at least treats them as less significant than alphabetic characters). If you had strings that had less punctuation, or differences that didn't happen to arrive shortly after the 8-byte boundary, then it wouldn't make any difference. And we still have to run strxfrm at least once, write out the whole binary blob to memory somewhere, and if it spills to disk we still have to write and read much more data.

I think recognizing cases where equality is the only thing we're interested in and locale-sensitive sorting isn't necessary, and using a memcmp, would be a clear win. I'm not immediately clear on what the cleanest way to integrate it would be. A btree opclass function like the cmp function but that doesn't need to be consistent with < and >, only =? Or perhaps a flag on the btree opclass that indicates that the data types can safely be compared with memcmp when equality is all that's needed? The latter is pretty tempting since it would tell code something interesting about the data type's internal storage that may lead to other optimizations. On the other hand, the former is nice in that the operator could maybe handle other cases like padding by doing memcmp on only the significant bits.

-- greg
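The memcmp fast path described above can be sketched in isolation (a hypothetical standalone illustration, not actual PostgreSQL code; TypeInfo, the byte_comparable flag, and datum_equal are invented names standing in for whatever opclass mechanism would carry this information):

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical per-opclass metadata: does byte equality coincide with
 * datatype equality?  (True for types like bytea or integers; it cannot
 * be assumed for locale-sensitive text in general.)
 */
typedef struct TypeInfo
{
    bool byte_comparable;
} TypeInfo;

/*
 * Equality-only comparison: take the cheap memcmp path when the type
 * allows it, otherwise fall back to the locale-aware comparator.
 */
static bool
datum_equal(const TypeInfo *type, const char *a, size_t alen,
            const char *b, size_t blen)
{
    if (type->byte_comparable)
        return alen == blen && memcmp(a, b, alen) == 0;

    /* locale-sensitive fallback; assumes NUL-terminated strings */
    return strcoll(a, b) == 0;
}
```

The point of the opclass-flag variant is exactly the first branch: when equality is all that's needed, no strxfrm blob has to be built, written out, or spilled at all.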
Re: [HACKERS] Materialized views don't show up in information_schema
2014-10-18 Stephen Frost sfr...@snowman.net:
> * Peter Eisentraut (pete...@gmx.net) wrote:
>> More subtly, if we claim that a materialized view is a view, then we cannot have asynchronously updated materialized views, because then we have different semantics.
> This is, at least, a reason I can understand, though I'm not sure I see it as sufficient to say matviews are so different from views that they shouldn't be listed as such.

Maybe it's useful to try to imagine who the actual consumers of INFORMATION_SCHEMA are? It's probably mostly generic SQL tools that try to provide a list of objects with corresponding operations (drop, alter, dump to a file, show contents, etc.) or tools that provide syntax completion.

I can only imagine two practical (though rather hypothetical) problems caused by the semantic mismatch between normal views and possibly-outdated matviews:

(1) Accidentally seeing outdated data: Something tries to replace part of a query with a reference to a matview, because INFORMATION_SCHEMA says that the definition of the view is such-and-so. This causes the resulting query to possibly see outdated data.

(2) Accidentally seeing data that is too new: Something replaces a reference to a matview in a query with the defining query of the matview. This causes the resulting query to possibly see data that is too new, assuming that the original query author is trying to rely on the outdatedness of the matview in the vein of "I want to see yesterday's data". I personally consider relying on the outdatedness of a matview to be bad design; maybe that should be mentioned in the documentation if I'm not the only one thinking that way. (Note that (2) also happens when a generic SQL tool regenerates a schema by recreating a matview as a normal view. The resulting normal view seems to contain data that is too new.)
Those problems sound so far-fetched that I suggest putting matviews (even though they may be out of date) in INFORMATION_SCHEMA.VIEWS as if they were normal views, so that in all other use cases (i.e., the above-mentioned generic SQL tools), the right thing happens. It is probably useful to put them in INFORMATION_SCHEMA.TABLES with a specialized type MATERIALIZED VIEW (or somesuch), so that tools that know about the existence of matviews can tell the difference.

Does someone know what other DBMSs do in this regard? I.e., do they put anything in INFORMATION_SCHEMA.VIEWS for matviews? What TABLE_TYPE do they use in INFORMATION_SCHEMA.TABLES?

Nicolas

-- A. Because it breaks the logical sequence of discussion. Q. Why is top posting bad?
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On Fri, Oct 17, 2014 at 11:03:04PM -0400, Tom Lane wrote:
> Bruce Momjian br...@momjian.us writes:
>> On Fri, Oct 17, 2014 at 06:15:37PM -0400, Tom Lane wrote:
>>> Those stats were perfectly valid: what the planner is looking for is accurate minimum and maximum values for the index's leading column, and that's what it got. You're correct that a narrower index could have given the same results with a smaller disk footprint, but the planner got the results it needed from the index you provided for it to work with.
>> Uh, why is the optimizer looking at the index on a,b,c and not just the stats on column a, for example? I am missing something here.
> Because it needs up-to-date min/max values in order to avoid being seriously misled about selectivities of values near the endpoints. See commit 40608e7f949fb7e4025c0ddd5be01939adc79eec.

Oh, I had forgotten we did that. It is confusing that there is no way via EXPLAIN to see the access, making the method of consulting pg_stat_* and using EXPLAIN unreliable. Should we document this somewhere?

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
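The hazard Tom alludes to — stale min/max values seriously misleading selectivity estimation near the endpoints — can be seen with a toy linear-interpolation estimator (purely illustrative C, not the planner's actual arithmetic; the function name is invented). Suppose the table holds ids 1..1,000,000, but the last ANALYZE ran when the maximum was 900,000:

```c
/*
 * Toy selectivity estimate for "col > threshold", assuming values are
 * uniformly distributed between col_min and col_max (roughly what a
 * histogram-based estimator does near the endpoints).
 */
static double
fraction_greater_than(double threshold, double col_min, double col_max)
{
    if (threshold >= col_max)
        return 0.0;             /* a stale max yields "no rows" -- badly wrong */
    if (threshold <= col_min)
        return 1.0;
    return (col_max - threshold) / (col_max - col_min);
}
```

With the stale max of 900,000, fraction_greater_than(950000, 1, 900000) returns 0.0 ("no matching rows"), while the true fraction is about 5%. Probing the index endpoint for the actual current max, as commit 40608e7f does, is what repairs this.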
Re: [HACKERS] Code bug or doc bug?
On Mon, Oct 13, 2014 at 12:17:54PM -0400, Bruce Momjian wrote:
> On Wed, Aug 27, 2014 at 06:39:21AM -0700, David G Johnston wrote:
>> Is there a doc patch to make here? 1. Last sentence change suggestion: "The target tablespace must be empty." 2. Based on Robert's comments it sounds like a "You cannot change the default tablespace of the current database." comment should be added as well. Side note: I have no clue what the "mapped relations" Robert refers to are...
> I have created the attached doc patch for this. Should we backpatch this through 9.0, or just 9.4?

Applied.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/18/14, 5:16 AM, Tom Lane wrote:
> BTW, on re-reading that code I notice that it will happily seize upon the first suitable index (first in OID order), regardless of how many lower-order columns that index has got. This doesn't make any difference I think for get_actual_variable_range's own purposes, because it's only expecting to touch the endmost index page regardless. However, in light of Marko's complaint maybe we should teach it to check all the indexes and prefer the matching one with fewest columns? It would only take a couple extra lines of code, and probably not that many added cycles considering we're going to do an index access of some sort. But I'm not sure if it's worth any extra effort --- I think in his example case, there wasn't any narrower index anyway.

Perhaps accidentally, this would have helped in my case, actually, since I could have created a new, smaller index CONCURRENTLY and then seen that the usage of the other index stopped increasing. With the "pick the smallest OID" behaviour that was not possible.

Another idea I had was some way to tell the optimizer not to use that particular index for stats lookups, but probably the use case for such a feature would be a bit narrow.

All that said, I don't think my struggles justify the change you described above. Not sure if it's a good idea or not.

.marko
Re: [HACKERS] interval typmodout is broken
On Mon, Oct 13, 2014 at 07:38:39PM -0400, Bruce Momjian wrote:
> I think the basic problem is that the original author had the idea of doing:
>
> SELECT INTERVAL (2) '100. seconds';
>  interval
> ----------
>  00:01:41
>
> and using (2) in that location as a short-hand when the interval precision units were not specified, which seems logical. However, they allowed it even when the units were specified:
>
> SELECT INTERVAL (2) '100. seconds' HOUR to SECOND;
>  interval
> ----------
>  00:01:41
>
> and in cases where the precision made no sense:
>
> SELECT INTERVAL (2) '100. seconds' HOUR to MINUTE;
>  interval
> ----------
>  00:01:00
>
> I have created the attached patch which only allows parentheses in the first case.

Patch applied.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On Sat, Oct 18, 2014 at 04:29:41PM +0200, Marko Tiikkaja wrote:
> On 10/18/14, 5:16 AM, Tom Lane wrote:
>> BTW, on re-reading that code I notice that it will happily seize upon the first suitable index (first in OID order), regardless of how many lower-order columns that index has got. This doesn't make any difference I think for get_actual_variable_range's own purposes, because it's only expecting to touch the endmost index page regardless. However, in light of Marko's complaint maybe we should teach it to check all the indexes and prefer the matching one with fewest columns? It would only take a couple extra lines of code, and probably not that many added cycles considering we're going to do an index access of some sort. But I'm not sure if it's worth any extra effort --- I think in his example case, there wasn't any narrower index anyway.
> Perhaps accidentally this would have helped in my case, actually, since I could have created a new, smaller index CONCURRENTLY and then seen that the usage of the other index stopped increasing. With the "pick the smallest OID" behaviour that was not possible. Another idea I had was some way to tell the optimizer not to use that particular index for stats lookups, but probably the use case for such a feature would be a bit narrow.

Well, if the index is there, why not use it? I thought the problem was just that you had no visibility into how those statistics were being accessed. Most people think EXPLAIN shows all accesses, but obviously now it doesn't.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Hash index creation warning
On Fri, Oct 17, 2014 at 02:36:55PM -0400, Bruce Momjian wrote:
> On Fri, Oct 17, 2014 at 12:56:52PM -0400, Tom Lane wrote:
>> David G Johnston david.g.johns...@gmail.com writes:
>>> The question is whether we explain the implications of not being WAL-logged in an error message or simply state the fact and let the documentation explain the hazards - basically just output: "hash indexes are not WAL-logged and their use is discouraged"
>> +1. The warning message is not the place to be trying to explain all the details.
> OK, updated patch attached.

Patch applied.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/18/14, 4:33 PM, Bruce Momjian wrote:
> On Sat, Oct 18, 2014 at 04:29:41PM +0200, Marko Tiikkaja wrote:
>> Another idea I had was some way to tell the optimizer not to use that particular index for stats lookups, but probably the use case for such a feature would be a bit narrow.
> Well, if the index is there, why not use it? I thought the problem was just that you had no visibility into how those statistics were being accessed.

Yes, exactly; if I had had the option to disable the index from the optimizer's point of view, I'd have seen that it's not used for looking up any data by any queries, and thus I would have known that I can safely drop it without slowing down queries. Which was the only thing I cared about, and where the stats we provide failed me.

.marko
Re: [HACKERS] Materialized views don't show up in information_schema
> Does someone know what other DBMSs do in this regard? I.e., do they put anything in INFORMATION_SCHEMA.VIEWS for matviews? What TABLE_TYPE do they use in INFORMATION_SCHEMA.TABLES?

I can only speak for Oracle. Oracle doesn't have INFORMATION_SCHEMA, but their JDBC driver treats mviews as tables. If you ask the driver for a list of tables (explicitly passing "TABLE" as the only type to return), it will include mviews as well, returning "MATERIALIZED VIEW" as the type of the object.
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On Sat, Oct 18, 2014 at 04:38:37PM +0200, Marko Tiikkaja wrote:
> On 10/18/14, 4:33 PM, Bruce Momjian wrote:
>> On Sat, Oct 18, 2014 at 04:29:41PM +0200, Marko Tiikkaja wrote:
>>> Another idea I had was some way to tell the optimizer not to use that particular index for stats lookups, but probably the use case for such a feature would be a bit narrow.
>> Well, if the index is there, why not use it? I thought the problem was just that you had no visibility into how those statistics were being accessed.
> Yes, exactly; if I had had the option to disable the index from the optimizer's point of view, I'd have seen that it's not used for looking up any data by any queries, and thus I would have known that I can safely drop it without slowing down queries. Which was the only thing I cared about, and where the stats we provide failed me.

How many other cases do we have where the statistics are getting incremented and there is no user visibility into the operation?

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
Bruce Momjian br...@momjian.us writes:
> How many other cases do we have where the statistics are getting incremented and there is no user visibility into the operation?

* system catalog accesses
* vacuum/analyze/cluster/etc

The fact that system-initiated accesses get counted in the statistics is a feature, not a bug.

regards, tom lane
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
Marko Tiikkaja ma...@joh.to writes:
> On 10/18/14, 4:33 PM, Bruce Momjian wrote:
>> Well, if the index is there, why not use it? I thought the problem was just that you had no visibility into how those statistics were being accessed.
> Yes, exactly; if I had had the option to disable the index from the optimizer's point of view, I'd have seen that it's not used for looking up any data by any queries, and thus I would have known that I can safely drop it without slowing down queries. Which was the only thing I cared about, and where the stats we provide failed me.

This argument is *utterly* wrongheaded, because it assumes that the planner's use of the index provided no benefit to your queries. If the planner was touching the index at all then it was planning queries in which knowledge of the extremal value was relevant to accurate selectivity estimation. So it's quite likely that without the index you'd have gotten different and inferior plans, whether or not those plans actually chose to use the index.

regards, tom lane
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/18/14, 5:46 PM, Tom Lane wrote:
> Marko Tiikkaja ma...@joh.to writes:
>> Yes, exactly; if I had had the option to disable the index from the optimizer's point of view, I'd have seen that it's not used for looking up any data by any queries, and thus I would have known that I can safely drop it without slowing down queries. Which was the only thing I cared about, and where the stats we provide failed me.
> This argument is *utterly* wrongheaded, because it assumes that the planner's use of the index provided no benefit to your queries. If the planner was touching the index at all then it was planning queries in which knowledge of the extremal value was relevant to accurate selectivity estimation. So it's quite likely that without the index you'd have gotten different and inferior plans, whether or not those plans actually chose to use the index.

Maybe. But at the same time that's a big problem: there's no way of knowing whether the index is actually useful or not when it's used only by the query planner.

.marko
[HACKERS] FieldSelect optimization versus overall planner organization
I looked into bug #11703, http://www.postgresql.org/message-id/20141017162236.2523.74...@wrigleys.postgresql.org

What is happening here is that the FieldSelect optimization in eval_const_expressions fails to play nice with the logic that determines which Vars need to be available from a relation scan. We start out with, say,

select x from my_table, func2(my_table) f(x);

which inline_set_returning_functions() expands into

select x from my_table, lateral (select func1(my_table)) f(x);

At this point we decide that the sub-select can't be flattened into the upper query (in this case, because it lacks a FROM clause, but any other reason to suppress flattening would have the same result). So we put the sub-select on the back burner for the moment and start to plan the outer query. In particular, we determine that the only Var that the scan of my_table has to produce is a whole-row Var (my_table.*) to satisfy the lateral reference in the sub-select.

Eventually, we get around to recursively planning the sub-query, and at that point we'll inline the func1() call, producing a FieldSelect applied to the whole-row Var for mytable. When we do const-simplification on the resulting expression, that pair of nodes is reduced to a simple Var referencing mytable.my_column. Which ought to be fine --- except that we already planned the scan of my_table on the assumption that it should produce my_table.*. So after all the dust settles and setrefs.c is doing the janitorial work, it finds out that the lateral reference to mytable.my_column can't be satisfied, and you get the dreaded "variable not found in subplan target list" error.

AFAICS, the only way to fix this right would require major rearrangements of the order in which the planner does stuff, for example insisting on doing function inlining and const-simplification all the way down in subqueries before we plan the outer query level. That'd be quite a lot of work to change, and the odds of introducing new bugs are high enough that I think back-patching would be out of the question anyway. I think the only practical fix is to lobotomize the FieldSelect-collapsing optimization. It's probably safe enough to keep allowing it for whole-row Vars of the current query level, but not those with levelsup > 0.

Anybody have another idea?

regards, tom lane
Re: [HACKERS] Review of GetUserId() Usage
All,

I'll break them into three pieces: superuser() cleanup, GetUserId() -> has_privs_of_role(), and the additional-role-attributes patch will just depend on the others. Attached is a patch for the GetUserId() -> has_privs_of_role() cleanup for review.

-Adam

-- Adam Brightwell - adam.brightw...@crunchydatasolutions.com Database Engineer - www.crunchydatasolutions.com

diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 67539ec..42d9a1f
*** a/src/backend/utils/adt/misc.c
--- b/src/backend/utils/adt/misc.c
***************
*** 34,39 ****
--- 34,40 ----
  #include "storage/pmsignal.h"
  #include "storage/proc.h"
  #include "storage/procarray.h"
+ #include "utils/acl.h"
  #include "utils/lsyscache.h"
  #include "utils/ruleutils.h"
  #include "tcop/tcopprot.h"
*************** pg_signal_backend(int pid, int sig)
*** 113,119 ****
  		return SIGNAL_BACKEND_ERROR;
  	}
! 	if (!(superuser() || proc->roleId == GetUserId()))
  		return SIGNAL_BACKEND_NOPERMISSION;
  	/*
--- 114,120 ----
  		return SIGNAL_BACKEND_ERROR;
  	}
! 	if (!has_privs_of_role(GetUserId(), proc->roleId))
  		return SIGNAL_BACKEND_NOPERMISSION;
  	/*
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 44ccd37..ea2cd1e
*** a/src/backend/utils/adt/pgstatfuncs.c
--- b/src/backend/utils/adt/pgstatfuncs.c
***************
*** 20,25 ****
--- 20,26 ----
  #include "libpq/ip.h"
  #include "miscadmin.h"
  #include "pgstat.h"
+ #include "utils/acl.h"
  #include "utils/builtins.h"
  #include "utils/inet.h"
  #include "utils/timestamp.h"
*************** pg_stat_get_activity(PG_FUNCTION_ARGS)
*** 675,681 ****
  			nulls[15] = true;
  		/* Values only available to same user or superuser */
! 		if (superuser() || beentry->st_userid == GetUserId())
  		{
  			SockAddr	zero_clientaddr;
--- 676,682 ----
  			nulls[15] = true;
  		/* Values only available to same user or superuser */
! 		if (has_privs_of_role(GetUserId(), beentry->st_userid))
  		{
  			SockAddr	zero_clientaddr;
*************** pg_stat_get_backend_activity(PG_FUNCTION
*** 877,883 ****
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		activity = "<backend information not available>";
! 	else if (!superuser() && beentry->st_userid != GetUserId())
  		activity = "<insufficient privilege>";
  	else if (*(beentry->st_activity) == '\0')
  		activity = "<command string not enabled>";
--- 878,884 ----
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		activity = "<backend information not available>";
! 	else if (!has_privs_of_role(GetUserId(), beentry->st_userid))
  		activity = "<insufficient privilege>";
  	else if (*(beentry->st_activity) == '\0')
  		activity = "<command string not enabled>";
*************** pg_stat_get_backend_waiting(PG_FUNCTION_
*** 898,904 ****
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  	result = beentry->st_waiting;
--- 899,905 ----
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!has_privs_of_role(GetUserId(), beentry->st_userid))
  		PG_RETURN_NULL();
  	result = beentry->st_waiting;
*************** pg_stat_get_backend_activity_start(PG_FU
*** 917,923 ****
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  	result = beentry->st_activity_start_timestamp;
--- 918,924 ----
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!has_privs_of_role(GetUserId(), beentry->st_userid))
  		PG_RETURN_NULL();
  	result = beentry->st_activity_start_timestamp;
*************** pg_stat_get_backend_xact_start(PG_FUNCTI
*** 943,949 ****
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  	result = beentry->st_xact_start_timestamp;
--- 944,950 ----
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!has_privs_of_role(GetUserId(), beentry->st_userid))
  		PG_RETURN_NULL();
  	result = beentry->st_xact_start_timestamp;
*************** pg_stat_get_backend_start(PG_FUNCTION_AR
*** 965,971 ****
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  	result = beentry->st_proc_start_timestamp;
--- 966,972 ----
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!has_privs_of_role(GetUserId(), beentry->st_userid))
  		PG_RETURN_NULL();
  	result = beentry->st_proc_start_timestamp;
*************** pg_stat_get_backend_client_addr(PG_FUNCT
*** 989,995 ****
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
! 	if (!superuser() && beentry->st_userid !=
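The transformation this patch applies throughout can be shown in isolation (a hypothetical sketch: this stand-in has_privs_of_role only models the trivial superuser/same-role cases, whereas the real PostgreSQL function also walks inherited role memberships, which is exactly the generality the old-style check lacks; SUPERUSER_OID and the may_view_* names are invented for illustration):

```c
#include <stdbool.h>

typedef unsigned int Oid;

#define SUPERUSER_OID 10        /* hypothetical superuser role OID */

/*
 * Stand-in for PostgreSQL's has_privs_of_role(): does 'member' have the
 * privileges of 'role'?  The real function also considers inherited
 * role memberships.
 */
static bool
has_privs_of_role(Oid member, Oid role)
{
    return member == SUPERUSER_OID || member == role;
}

/* Old-style check, as removed by the patch */
static bool
may_view_old(Oid current_user, Oid target_user)
{
    return current_user == SUPERUSER_OID || current_user == target_user;
}

/* New-style check: one call expresses the same (and a more general) test */
static bool
may_view_new(Oid current_user, Oid target_user)
{
    return has_privs_of_role(current_user, target_user);
}
```

For the cases the old code handled, the two are equivalent; the payoff is that the new form additionally grants access through role membership without scattering that logic across every call site.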
[HACKERS] (auto-)analyze causing bloat/load
Hi,

I've noticed an interesting phenomenon. Look at the following pgbench -c 32 -j 32 -P 1 output on a scale 200 database. Then guess where I've started

SET vacuum_cost_delay = '20ms';
ANALYZE pgbench_accounts;

and when it finished. The cost_delay is to simulate an autovac-triggered analyze.

progress: 18.0 s, 17018.1 tps, lat 1.879 ms stddev 1.018
progress: 19.0 s, 17039.6 tps, lat 1.876 ms stddev 0.806
progress: 20.0 s, 16892.1 tps, lat 1.893 ms stddev 0.732
progress: 21.0 s, 16636.9 tps, lat 1.922 ms stddev 0.933
progress: 22.0 s, 16198.4 tps, lat 1.974 ms stddev 1.180
progress: 23.0 s, 16362.1 tps, lat 1.954 ms stddev 0.858
progress: 24.0 s, 15926.0 tps, lat 2.007 ms stddev 0.967
progress: 25.0 s, 14550.5 tps, lat 2.197 ms stddev 1.870
progress: 26.0 s, 15042.2 tps, lat 2.125 ms stddev 1.135
progress: 27.0 s, 14579.1 tps, lat 2.196 ms stddev 1.016
progress: 28.0 s, 14145.0 tps, lat 2.261 ms stddev 1.160
progress: 29.0 s, 13734.1 tps, lat 2.328 ms stddev 0.992
progress: 30.0 s, 12127.6 tps, lat 2.565 ms stddev 1.930
progress: 31.0 s, 12375.2 tps, lat 1.807 ms stddev 1.556
progress: 32.0 s, 11939.9 tps, lat 3.551 ms stddev 45.457
progress: 33.0 s, 12466.0 tps, lat 2.566 ms stddev 1.517
progress: 34.0 s, 12166.1 tps, lat 2.634 ms stddev 1.309
progress: 35.0 s, 11743.7 tps, lat 2.723 ms stddev 1.576
progress: 36.0 s, 11623.3 tps, lat 2.752 ms stddev 1.237
progress: 37.0 s, 11341.4 tps, lat 2.817 ms stddev 1.413
progress: 38.0 s, 11130.2 tps, lat 2.875 ms stddev 1.359
progress: 39.0 s, 10988.0 tps, lat 2.910 ms stddev 1.132
progress: 40.0 s, 10681.9 tps, lat 2.993 ms stddev 1.355
progress: 41.0 s, 10573.1 tps, lat 3.027 ms stddev 1.738
progress: 42.0 s, 10174.9 tps, lat 3.143 ms stddev 2.125
progress: 43.0 s, 10190.1 tps, lat 3.136 ms stddev 1.892
progress: 44.0 s, 9943.2 tps, lat 3.219 ms stddev 1.740
progress: 45.0 s, 9792.5 tps, lat 3.265 ms stddev 2.229
progress: 46.0 s, 9701.6 tps, lat 3.299 ms stddev 1.834
progress: 47.0 s, 9376.6 tps, lat 3.410 ms stddev 2.149
progress: 48.0 s, 9217.2 tps, lat 3.471 ms stddev 1.992
progress: 49.0 s, 9265.9 tps, lat 3.449 ms stddev 1.819
progress: 50.0 s, 9090.9 tps, lat 3.521 ms stddev 1.916
progress: 51.0 s, 8933.9 tps, lat 3.578 ms stddev 1.813
progress: 52.0 s, 8849.6 tps, lat 3.616 ms stddev 1.913
progress: 53.0 s, 8767.5 tps, lat 3.649 ms stddev 1.952
progress: 54.0 s, 8521.5 tps, lat 3.753 ms stddev 2.158
progress: 55.0 s, 8382.2 tps, lat 3.816 ms stddev 2.383
progress: 56.0 s, 8298.1 tps, lat 3.848 ms stddev 2.727
progress: 57.0 s, 8369.5 tps, lat 3.825 ms stddev 2.897
progress: 58.0 s, 8205.0 tps, lat 3.902 ms stddev 2.669
progress: 59.0 s, 8048.1 tps, lat 3.974 ms stddev 2.473
progress: 60.0 s, 7886.5 tps, lat 4.050 ms stddev 2.456
progress: 61.0 s, 7539.0 tps, lat 4.251 ms stddev 2.992
progress: 62.0 s, 7815.4 tps, lat 4.090 ms stddev 2.845
progress: 63.0 s, 7863.4 tps, lat 4.062 ms stddev 3.032
progress: 64.0 s, 7750.6 tps, lat 4.135 ms stddev 2.677
progress: 65.0 s, 7580.7 tps, lat 4.217 ms stddev 2.210
progress: 66.0 s, 7164.8 tps, lat 4.466 ms stddev 2.511
progress: 67.0 s, 7504.4 tps, lat 4.265 ms stddev 1.343
progress: 68.0 s, 9530.0 tps, lat 3.360 ms stddev 1.742
progress: 69.0 s, 16751.3 tps, lat 1.909 ms stddev 0.652
progress: 70.0 s, 16123.7 tps, lat 1.983 ms stddev 0.751
progress: 71.0 s, 15122.7 tps, lat 2.115 ms stddev 1.065
Yes. That's right. ANALYZE caused a slowdown of nearly 2x. Executing it on a bigger database, where it takes longer, makes it much worse; a nonuniform access pattern makes it *much* worse still. So, you might ask, why is that? It's because ANALYZE keeps a snapshot around and thus blocks RecentGlobalXmin from advancing, making HOT pruning impossible on popular pages. Now, the update rates in this example are certainly above what most sites see, but at practical sites relations are also much larger, and thus autoanalyze also takes longer; much longer with a large sample size.
Interestingly, we already set PROC_IN_ANALYZE, but we never actually look at it. I wonder what it'd take to ignore ANALYZE in GetSnapshotData()/GetOldestXmin()... At first sight it looks quite doable to not have a snapshot around (or to set it up to be ignored) while acquire_sample_rows() is running, as that's the expensive bit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
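The mechanism Andres describes can be sketched with a toy model (Python, purely illustrative; names like `recent_global_xmin` mirror the PostgreSQL concepts, but none of this is the actual implementation): a single long-lived snapshot pins the global xmin horizon, so dead row versions newer than that snapshot cannot be pruned no matter how fast the short OLTP transactions advance.

```python
# Illustrative model only: each backend advertises the oldest xid its
# snapshot can still see; a dead row version may only be pruned once no
# snapshot can see it anymore.

def recent_global_xmin(backend_xmins):
    """Horizon = oldest advertised xmin across all backends."""
    active = [x for x in backend_xmins if x is not None]
    return min(active) if active else None

def prunable(dead_at_xids, horizon):
    """A dead row version is prunable only if it died before the horizon."""
    return [x for x in dead_at_xids if horizon is None or x < horizon]

# Short OLTP transactions keep advancing their xmins...
oltp = [1005, 1006, 1007]
# ...but a long-running ANALYZE still holds a snapshot taken at xid 200.
analyze = 200

dead = [150, 500, 900, 1001]   # xids at which row versions became dead

# With ANALYZE's snapshot pinning the horizon, almost nothing is prunable:
print(prunable(dead, recent_global_xmin(oltp + [analyze])))  # [150]
# If ANALYZE's snapshot were ignored, pruning could proceed:
print(prunable(dead, recent_global_xmin(oltp)))              # [150, 500, 900, 1001]
```

This is exactly why the bloat in the pgbench run above grows for the whole duration of the ANALYZE and recovers only after it commits.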
Re: [HACKERS] Optimizer on sort aggregate
On Sat, Oct 18, 2014 at 5:27 AM, Greg Stark st...@mit.edu wrote: That's interesting but I think it's mostly a quirk of your example. AFAICS the difference is only that the en_US locale ignores punctuation like : and / (or at least treats them as less significant than alphabetic characters). If you had strings with less punctuation, or differences that didn't happen to arrive shortly after the 8-byte boundary, then it wouldn't make any difference. The en_US locale treats those kinds of punctuation characters as less significant than alphabetical characters, but it is not at all unusual in doing so; all the locales I tested do the same. They also do the same for spaces. And there is another property of the transformation that is very important for those outside the English-speaking world: diacritics are not represented at the primary weight level. So even though representing certain characters with a diacritic will take 2 bytes in UTF-8, the corresponding primary weight only takes 1 byte. In short, as I said, the concentration of entropy can easily be a lot higher within the first n bytes of the primary weight level of the transformed blob than within the first n bytes of the original string, and frequently *will* be. This will make worst cases for abbreviated keys significantly less common than they would otherwise be, while more generally improving performance. Of course, that might not always be as effective as we'd prefer, but that's something you more or less have to live with in order to have competitive sort performance (even with the HyperLogLog amelioration, you still pay something for considering the technique). You can always contrive a case that puts things just out of reach, no matter how much entropy you manage to concentrate in the abbreviated key. Fortunately, I don't think those cases are all that representative of what people want from sort performance.
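The entropy-concentration argument can be sketched with a crude stand-in for strxfrm()'s primary weight level (Python; real libc/ICU collations are far subtler than this toy transform, which merely drops punctuation, spaces, and diacritics and folds case). The point is that the transform pulls the characters that actually differ forward, so they land inside the first 8 bytes used by the abbreviated key:

```python
import unicodedata

def toy_primary_weights(s):
    """Crude stand-in for strxfrm()'s primary weight level: drop
    diacritics, punctuation, and spaces, and fold case.  Real collations
    are much more sophisticated; this only illustrates the shape."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed.lower()
                   if not unicodedata.combining(c) and c.isalnum())

def abbrev(s, n=8):
    """The n-byte prefix an abbreviated key would compare."""
    return s[:n]

dates = ["2014-10-18", "2014-10-19", "2014-10-20"]
# Raw 8-byte prefixes are all identical, so abbreviated keys resolve nothing:
print({abbrev(d) for d in dates})                       # {'2014-10-'}
# Transformed prefixes differ within 8 bytes, so every comparison resolves:
print({abbrev(toy_primary_weights(d)) for d in dates})  # {'20141018', '20141019', '20141020'}

# Diacritics vanish at the primary level, so multi-byte UTF-8 characters
# cost only one byte of abbreviated-key space:
print(toy_primary_weights("Évora"))                     # evora
```

Under this toy transform the abbreviated keys fully order the date strings, whereas the raw prefixes would force a full tie-break comparison for every pair.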
And we still have to run strxfrm() at least once, write out the whole binary blob to memory somewhere, and if it spills to disk we still have to write and read much more data. I think recognizing cases where equality is the only thing we're interested in and locale-sensitive sorting isn't necessary and using a memcmp would be a clear win. Yeah. I was making a point that strictly concerned abbreviated keys as proposed in response to Feng's remarks. I am not 100% sure that the benefits of abbreviated keys with locale support aren't worth it here, and I say that in full agreement with Feng about locale-aware sorts not actually being necessary. It's clear that cache performance is essential to getting good sort performance, which strongly recommends abbreviated keys. When we consider the concentration of entropy with ordinary locale-aware abbreviated keys, as compared to abbreviated keys that just use the C locale artificially for this case, it's not clear that the concentration of entropy isn't reason enough to prefer locale-aware abbreviated keys. Now, it might well be that paying for n strxfrm() operations, rather than doing n straight-up memcpy() operations, isn't worth it. But I think it might well be worth it, particularly when you factor in the complexity avoided by not special-casing this. I'm not really sure. The general idea of abbreviated keys is almost old hat, to be honest. It just happens to be essential for competitive sort performance. -- Peter Geoghegan
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On Sat, Oct 18, 2014 at 06:15:03PM +0200, Marko Tiikkaja wrote: On 10/18/14, 5:46 PM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: Yes, exactly; if I had had the option to disable the index from the optimizer's point of view, I'd have seen that it's not used for looking up any data by any queries, and thus I would have known that I can safely drop it without slowing down queries. Which was the only thing I cared about, and where the stats we provide failed me. This argument is *utterly* wrongheaded, because it assumes that the planner's use of the index provided no benefit to your queries. If the planner was touching the index at all then it was planning queries in which knowledge of the extremal value was relevant to accurate selectivity estimation. So it's quite likely that without the index you'd have gotten different and inferior plans, whether or not those plans actually chose to use the index. Maybe. But at the same time that's a big problem: there's no way of knowing whether the index is actually useful or not when it's used only by the query planner. That is a good point. Without an index the executor has to fall back to a sequential scan, while for the optimizer a missing index just means worse statistics. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
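The planner's reliance on up-to-date extremal values can be illustrated with a toy range-selectivity estimate (Python; a hedged sketch of linear interpolation between assumed min/max bounds, not PostgreSQL's actual selectivity code). If the statistics' max is stale, a query probing beyond it is estimated at zero rows, which is exactly the failure mode that probing the btree endpoint avoids:

```python
def est_selectivity(low, high, x):
    """Toy linear-interpolation estimate of the fraction of rows with
    value > x, given assumed min/max bounds for the column."""
    if x >= high:
        return 0.0
    if x <= low:
        return 1.0
    return (high - x) / (high - low)

# Stats say values span 1..1000, but the table has since grown to max 2000.
print(est_selectivity(1, 1000, 1500))           # 0.0 -> "no rows", badly wrong
# Probing the index endpoint recovers the true max and a sane estimate:
print(round(est_selectivity(1, 2000, 1500), 2)) # 0.25
```

This is why an index that idx_scan says is "unused" can still be paying for itself: the planner reads its endpoint to keep estimates like this honest.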
[HACKERS] initdb failure on RH 5.10
Hi, One of our sites ran into a problem with database installation: initdb failed FATAL: unexpected data beyond EOF in block 19 of relation base/1/2609 HINT: This has been seen to occur with buggy kernels; consider updating your system. STATEMENT: COMMENT ON FUNCTION euc_jis_2004_to_shift_jis_2004 (INTEGER, INTEGER, CSTRING, INTERNAL, INTEGER) IS 'internal conversion function for EUC_JIS_2004 to SHIFT_JIS_2004'; initdb is called like this: initdb -D data-dir -L input-dir -E UTF8 --locale=C This is Postgres 8.4.4; the installation piece has been stable and always worked, but this time they have a new Red Hat 5.10 server. # uname -a Linux hostname 2.6.18-371.el5 #1 SMP Thu Sep 5 21:21:44 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux # cat /etc/redhat-release Red Hat Enterprise Linux Server release 5.10 (Tikanga) I am not sure if this is helpful, but just in case: # ldd euc_jis_2004_and_shift_jis_2004.so linux-vdso.so.1 => (0x7fff259fd000) libc.so.6 => /lib64/libc.so.6 (0x2b3d5a756000) /lib64/ld-linux-x86-64.so.2 (0x00384840) And these libs are softlinks: /lib64/libc.so.6 -> libc-2.5.so /lib64/ld-linux-x86-64.so.2 -> ld-2.5.so The only thought I have at this point is to run it with strace, but maybe this is a known issue and someone has a better idea? Thank you, Michael.
Re: [HACKERS] (auto-)analyze causing bloat/load
Andres Freund and...@2ndquadrant.com writes: Interestingly we already set PROC_IN_ANALYZE - but we never actually look at it. I wonder what it'd take to ignore analyze in GetSnapshotData()/GetOldestXmin()... At first sight it looks quite doable to not have a snapshot around (or setup to be ignored) while acquire_sample_rows() is running. As that's the expensive bit... Say what? How is ANALYZE supposed to know which rows are valid without a snapshot? regards, tom lane
Re: [HACKERS] (auto-)analyze causing bloat/load
On 2014-10-18 14:30:53 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: Interestingly we already set PROC_IN_ANALYZE - but we never actually look at it. I wonder what it'd take to ignore analyze in GetSnapshotData()/GetOldestXmin()... At first sight it looks quite doable to not have a snapshot around (or setup to be ignored) while acquire_sample_rows() is running. As that's the expensive bit... Say what? How is ANALYZE supposed to know which rows are valid without a snapshot? I haven't fully investigated it yet. But it's already not using its own snapshot to determine visibility. Instead it uses GetOldestXmin() + HeapTupleSatisfiesVacuum(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] initdb failure on RH 5.10
BRUSSER Michael michael.brus...@3ds.com writes: initdb failed FATAL: unexpected data beyond EOF in block 19 of relation base/1/2609 HINT: This has been seen to occur with buggy kernels; consider updating your system. STATEMENT: COMMENT ON FUNCTION euc_jis_2004_to_shift_jis_2004 (INTEGER, INTEGER, CSTRING, INTERNAL, INTEGER) IS 'internal conversion function for EUC_JIS_2004 to SHIFT_JIS_2004'; That's ... surprising. We've only ever seen that error message in cases with heavy concurrent updates and wonky underlying storage; initdb is not where anyone would expect it. initdb is called like this: initdb -D data-dir -L input-dir -E UTF8 --locale=C It's not exactly customary to use -L in initdb calls. Is it possible you're pointing it to an incompatible library directory? Not that I see how that would lead to this behavior, but you're definitely dealing with something pretty weird. This is Postgres 8.4.4, the installation piece has been stable and always worked, but this time they have a new Red Hat 5.10 server What in the world are they doing using 8.4.4? The entire 8.4.x release series is out of support anyway, but there is little if any excuse not to be using the last minor release, 8.4.22. I'd call your attention also to the fact that RHEL 5.10 is obsolete. 5.11 came out last month, and Red Hat are not known for updating back-rev release series with inessential bug fixes. If you can still reproduce this with 5.11 and 8.4.22, people might be interested in looking more closely. Otherwise, well, you're dealing with five-year-old software with a very long list of known bugs. regards, tom lane
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
Hi, On 16.10.2014 03:26, Jeff Davis wrote: On Fri, 2014-08-29 at 10:12 -0400, Tom Lane wrote: What about removing the callback per se and just keeping the argument, as it were. That is, a MemoryContext has a field of type size_t * that is normally NULL, but if set to non-NULL, then we increment the pointed-to value for pallocs and decrement for pfrees. I like that idea; patch attached. Two differences: * At block level, not chunk level. * I use a uint64, because a size_t is only guaranteed to hold one allocation, and we need to sum up many allocations. When unused, it does still appear to have a little overhead in Robert's test on his power machine. It seems to be between a 0.5-1.0% regression. There's not much extra code on that path, just a branch, pointer dereference, and an addition; so I don't think it will get much cheaper than it is. This regression seems harder to reproduce on my workstation, or perhaps it's just noisier. I assume it's the reindex pgbench_accounts_pkey test, correct? I did the same test on my workstation (with an i7-4770R CPU), and ISTM there's a measurable difference, although in surprising directions. For binaries compiled with gcc 4.7.3 and clang 3.3 and 3.5, the results look like this (5 runs for each combination):

gcc 4.7.3 / master
------------------
LOG: internal sort ended, ... CPU 1.24s/4.32u sec elapsed 9.78 sec
LOG: internal sort ended, ... CPU 1.29s/4.31u sec elapsed 9.80 sec
LOG: internal sort ended, ... CPU 1.26s/4.31u sec elapsed 9.80 sec
LOG: internal sort ended, ... CPU 1.26s/4.37u sec elapsed 9.78 sec
LOG: internal sort ended, ... CPU 1.33s/4.29u sec elapsed 9.78 sec

gcc 4.7.3 / patched
-------------------
LOG: internal sort ended, ... CPU 1.35s/4.27u sec elapsed 9.78 sec
LOG: internal sort ended, ... CPU 1.33s/4.30u sec elapsed 9.74 sec
LOG: internal sort ended, ... CPU 1.34s/4.27u sec elapsed 9.75 sec
LOG: internal sort ended, ... CPU 1.27s/4.26u sec elapsed 9.78 sec
LOG: internal sort ended, ... CPU 1.35s/4.26u sec elapsed 9.78 sec

clang 3.3 / master
------------------
LOG: internal sort ended, ... CPU 1.32s/4.61u sec elapsed 10.00 sec
LOG: internal sort ended, ... CPU 1.27s/4.48u sec elapsed 9.95 sec
LOG: internal sort ended, ... CPU 1.35s/4.48u sec elapsed 9.99 sec
LOG: internal sort ended, ... CPU 1.32s/4.49u sec elapsed 9.97 sec
LOG: internal sort ended, ... CPU 1.32s/4.47u sec elapsed 10.04 sec

clang 3.3 / patched
-------------------
LOG: internal sort ended, ... CPU 1.35s/4.59u sec elapsed 10.13 sec
LOG: internal sort ended, ... CPU 1.31s/4.61u sec elapsed 10.06 sec
LOG: internal sort ended, ... CPU 1.28s/4.63u sec elapsed 10.10 sec
LOG: internal sort ended, ... CPU 1.27s/4.58u sec elapsed 10.01 sec
LOG: internal sort ended, ... CPU 1.29s/4.60u sec elapsed 10.05 sec

clang 3.5 / master
------------------
LOG: internal sort ended, ... CPU 1.30s/4.46u sec elapsed 9.96 sec
LOG: internal sort ended, ... CPU 1.30s/4.49u sec elapsed 9.96 sec
LOG: internal sort ended, ... CPU 1.30s/4.53u sec elapsed 9.95 sec
LOG: internal sort ended, ... CPU 1.25s/4.51u sec elapsed 9.95 sec
LOG: internal sort ended, ... CPU 1.30s/4.50u sec elapsed 9.95 sec

clang 3.5 / patched
-------------------
LOG: internal sort ended, ... CPU 1.32s/4.59u sec elapsed 9.97 sec
LOG: internal sort ended, ... CPU 1.27s/4.49u sec elapsed 9.91 sec
LOG: internal sort ended, ... CPU 1.29s/4.48u sec elapsed 9.88 sec
LOG: internal sort ended, ... CPU 1.31s/4.49u sec elapsed 9.93 sec
LOG: internal sort ended, ... CPU 1.23s/4.56u sec elapsed 9.90 sec

So while on clang 3.3 I really see about a ~1% slowdown, on the other two compilers it seems a tad faster (but it might easily be noise). It'd be interesting to know what compiler/version Robert used ... One thought in either case is that we don't have to touch the API for MemoryContextCreate: rather, the feature can be enabled in a separate call after making the context. That seems fairly awkward to me because the pointer needs to be inherited from the parent context when not specified. I left the extra API call in.
I don't see a big difference between adding a new API create method and adding a separate enable-tracking method. What however seems awkward is using the 'uint64 *' directly as a parameter, instead of using an 'enable_tracking' flag as in the previous versions. Is there a reason for that? This allows supplying a pointer to a uint64 variable, i.e. something like this:

{
    uint64 mem_used = 0;
    MemoryContext context = MemoryContextCreateTracked(..., &mem_used);
    ...
    if (mem_used > work_mem * 1024L)
    {
        ... do something ...
    }
}

Thanks to exposing the total_mem like this, it's possible to get the current value directly (without the API methods available in the previous versions). It also makes it possible to detach the accounting from the parent (i.e. not count it against the parent). But is this
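The API shape discussed above might be sketched like this (Python toy model, not the actual patch: `MemTotal` stands in for the 'uint64 *' argument, and the class and method names are hypothetical). The two properties under discussion are both visible: children inherit the parent's counter unless given their own, and a caller can read the running total directly at any time:

```python
class MemTotal:
    """Stands in for the 'uint64 *' the patch threads through contexts."""
    def __init__(self):
        self.bytes = 0

class MemoryContext:
    """Toy model: tracking is enabled by passing a counter at creation;
    children inherit the parent's counter unless given their own, which
    is what 'detached accounting' means here."""
    def __init__(self, parent=None, total=None):
        self.total = total if total is not None else (parent.total if parent else None)
        self.blocks = {}

    def alloc(self, name, size):
        """Block-level accounting: bump the shared counter on allocation."""
        self.blocks[name] = size
        if self.total is not None:
            self.total.bytes += size

    def free(self, name):
        size = self.blocks.pop(name)
        if self.total is not None:
            self.total.bytes -= size

mem_used = MemTotal()
top = MemoryContext(total=mem_used)
child = MemoryContext(parent=top)      # inherits top's counter
child.alloc("hashtable", 64 * 1024)
top.alloc("tupstore", 32 * 1024)
print(mem_used.bytes)                  # 98304 (both contexts counted together)
child.free("hashtable")
print(mem_used.bytes)                  # 32768
```

The caller-owned counter is what lets code like the work_mem check above read the total with a plain variable access instead of an accessor call.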
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/17/14, 10:16 PM, Tom Lane wrote: I wrote: Because it needs up-to-date min/max values in order to avoid being seriously misled about selectivities of values near the endpoints. See commit 40608e7f949fb7e4025c0ddd5be01939adc79eec. BTW, on re-reading that code I notice that it will happily seize upon the first suitable index (first in OID order), regardless of how many lower-order columns that index has got. This doesn't make any difference I think for get_actual_variable_range's own purposes, because it's only expecting to touch the endmost index page regardless. However, in light of Marko's complaint maybe we should teach it to check all the indexes and prefer the matching one with fewest columns? It would only take a couple extra lines of code, and probably not that many added cycles considering we're going to do an index access of some sort. But I'm not sure if it's worth any extra effort --- I think in his example case, there wasn't any narrower index anyway. The real cost here isn't the number of columns, it's the size of the index, no? So shouldn't we look at relpages instead? For example, you'd certainly want to use an index on (field_we_care_about, smallint_field) over an index on (field_we_care_about, big_honking_text_field). -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Trailing comma support in SELECT statements
On 10/17/14, 11:19 PM, David E. Wheeler wrote: On Oct 17, 2014, at 3:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yeah, exactly. Personally I'm *not* for this, but if we do it we should do it consistently: every comma-separated list in the SQL syntax should work the same. PL/pgSQL, too, I presume. Yes. The only case I can think of where we wouldn't want this is COPY. BTW, this should also apply to delimiters other than commas; for example, some geometry types use ; as a delimiter between points. I do think such a change should be made in stages, and maybe not every last one makes it into 9.5, but the intention should certainly be that we support extra delimiters *everywhere*. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Superuser connect during smart shutdown
But TBH I suspect 95% of the problems here would vanish if smart shutdown weren't the default ... But for your repeated objections, we would have changed the default to fast years ago. AFAICT everyone else is in favor of that. ...Robert
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/9/14, 4:19 PM, Andres Freund wrote: On 2014-10-09 18:16:46 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. Isn't that *precisely* what a plain vacuum run does? Well, it also scans for dead tuples, removes them, and needs to go through indexes to remove their references. IIRC it doesn't do most of that if there's no need. And if it's an insert-only table without rollbacks. I *do* think there are some optimizations we could make in general. No, it always attempts dead tuple removal. The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I have to believe that could seriously screw up autovacuum scheduling. Now that we have forks, I'm wondering if it would be best to come up with a per-page system that could be used to determine when a table needs background work to be done. The visibility map could serve a lot of this purpose, but I'm not sure if it would work for getting hint bits set in the background. I think it would also be a win if we had a way to advance relfrozenxid and relminmxid. Perhaps something that simply remembered the last XID that touched each page... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble!
http://BlueTreble.com