Re: [HACKERS] Missing optimization when filters are applied after window functions
On Thu, May 17, 2012 at 7:26 AM, Volker Grabsch wrote:
> Hitoshi Harada schrieb:
>> On Wed, May 16, 2012 at 12:50 AM, Volker Grabsch wrote:
>> > I propose the following general optimization: If all window
>> > functions are partitioned by the same first field (here: id),
>> > then any filter on that field should be executed before
>> > WindowAgg. So a query like this:
>>
>> I think that's possible. Currently the planner doesn't think any
>> qualification from the upper query can be pushed down to a query that
>> has a window function. It would be possible to let it push down if
>> the expression matches the PARTITION BY expression.
>
> Sounds great!
>
>> However, the challenge is that a query may have a number of window
>> functions that have different PARTITION BY expressions. At the time of
>> pushing down in the planning, it is not obvious which window function
>> comes first.
>
> I don't really understand what you mean with "which window function
> comes first", because to my understanding, all window functions of
> a query belong to the same level in the query hierarchy. But then,
> my knowledge of PostgreSQL internals isn't very deep, either.

No, you can specify as many window specifications as you like. Say,

SELECT count(*) over (w1), count(*) over (w2)
FROM tbl
WINDOW w1 AS (PARTITION BY x ORDER BY y),
       w2 AS (PARTITION BY y ORDER BY x);

and in the same query level there are different types of window specifications. The code as it stands today doesn't have any semantics about which of w1 or w2 to run first (probably w1 will be run first, but the query semantics don't enforce anything).

>> One idea is to restrict such optimization to only the case of a single
>> window function, and the other is to generalize it and cover a lot of
>> cases.
>
> From a practical point of view, the restriction to a single window
> function wouldn't be that bad, although I'd prefer to think about
> the number of different windows rather than the number of window functions.
>
> In other words, every optimization that is correct for a single window
> function is also correct for multiple window functions if those all use
> the same window.

Yeah, I mean multiple windows, not window functions.

>> That said, our planner on window functions has a lot of improvement
>> still to be done. Every kind of optimization I see is like what I raised
>> above; they can either be done easily by hacking in a small case, or they
>> can be done by generalizing for most of the cases. My understanding is our
>> project tends to like the latter; it takes a little more time but covers
>> more use cases.
>
> I'd also prefer to see a general solution, as this provides less
> room for unpleasant surprises (e.g. "This query is only slightly
> different from the previous one. Why does it take so much longer?").
>
> On the other hand, any small improvement is a big step forward
> regarding window functions.
>
> Unfortunately, I can't volunteer on that, as it is currently
> impossible for me to allocate enough time for this.
>
> However, any pointer to where to look at the source (or in the
> manual) would be of great help. Maybe I'll find at least enough time
> to provide a rough proposal, or to improve existing attempts
> to solve this issue.

Look at subquery_is_pushdown_safe in allpaths.c. That is where we stop considering whether the upper qualification can be pushed down into the inner sub-query when the sub-query has any window function expressions.
Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
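For illustration, here is a minimal sketch of the case being discussed (the table and column names are invented): both window functions partition by id and the outer filter is on id, so in principle the qual could run before WindowAgg rather than after it.

CREATE TABLE events (id int, ts timestamptz, amount numeric);

SELECT *
FROM (SELECT id, ts,
             sum(amount) OVER (PARTITION BY id ORDER BY ts) AS running_total,
             count(*)    OVER (PARTITION BY id)             AS n_events
      FROM events) AS s
WHERE id = 42;   -- today this filter is applied only after the WindowAgg step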
Re: [HACKERS] Schema version management
On the topic of fixing pg_dump to dump in a predictable order, can someone please update me on the current state of the problem? I've read through pg_dump_sort.c, and note that objects are first sorted in a type/name-based ordering, then topologically sorted in a way which aims to "minimize unnecessary rearrangement". How come this doesn't always generate a predictable order? Any ideas on how to fix the problem? If someone gives me a hint I might make an effort to implement the idea.

If pg_dump dumped in a predictable order, it would make sense to dump all overloaded versions of functions sharing the same name into the same file. Then it would be _guaranteed_ that two different databases committing their schema to a shared VCS commit exactly the same files if the schema is the same, which is not guaranteed unless the dump order is predictable.

Having thought about it, I agree the idea with arguments in filenames is probably possible, but suboptimal. Much better to write all overloaded functions to the same file and fix the predictable-dump-order problem.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
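As a hedged sketch of that idea (the function name and bodies are invented), both overloads below would end up in the same file, e.g. public/FUNCTION/get_balance.sql, once the dump order is predictable:

CREATE FUNCTION get_balance(account_id integer) RETURNS numeric
    LANGUAGE sql AS $$ SELECT 0::numeric $$;             -- placeholder body

CREATE FUNCTION get_balance(account_id integer, as_of timestamptz) RETURNS numeric
    LANGUAGE sql AS $$ SELECT 0::numeric $$;             -- placeholder body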
Re: [HACKERS] Schema version management
On Wed, May 23, 2012 at 9:09 AM, Tom Lane wrote:
> Thus, not implausibly, causing the dump to fail entirely on some
> filesystems. Case sensitivity, encoding issues, and special characters
> in names (eg slashes or backslashes, depending on platform) are
> additional pain points. This does not sound like a good plan from here.

This is true, which means some users won't be able to use the feature, because they are using an ancient OS or have function names with slashes (hm, is it even possible to have function names with slashes?).

The maximum length of names of tables, functions, etc. in Postgres is 63 characters. A function in Postgres can have at most 100 arguments. The vast majority of users run operating systems allowing file names of at least 255 characters: http://en.wikipedia.org/wiki/Comparison_of_file_systems

I suppose you have a lot more experience of what Postgres installations exist in the world. Do you think it's common for databases to have non-ASCII or otherwise problematic characters in object names? Is it a project policy that all features of all standard tools must be useful for all users on all platforms on all databases? Or is it acceptable if some features are only usable for, say, 90% of the users?

> Taking a step or two back, it seems to me that the thrust of your
> proposal is essentially to throw away all dump ordering information,
> which does not seem like a particularly good idea either. It certainly
> will not lead to a dump that can be restored reliably. If the use-case
> for this is database comparisons, I think we'd be a lot better off to
> write a postprocessing tool for regular dumps to perform such
> comparisons, rather than whacking pg_dump around to the point where it's
> unable to perform its primary function.

Not at all, the ordering information is not thrown away; it is preserved in the dump file specified by the -f option, from which each split file is included using \i. For example, this is an extract of the -f dump file in my database:

--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET client_encoding = 'SQL_ASCII';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

--
-- Name: pgx_diag; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA pgx_diag;

ALTER SCHEMA pgx_diag OWNER TO postgres;

-- ... some more schemas, languages etc ...
-- ... and then all the included files:

\i /home/postgres/database/gluepay-split/public/TYPE/dblink_pkey_results.sql
\i /home/postgres/database/gluepay-split/public/TYPE/r_matchedwithdrawal.sql
\i /home/postgres/database/gluepay-split/public/TYPE/r_unapprovedwithdrawal.sql
\i /home/postgres/database/gluepay-split/public/TYPE/ukaccountvalidationchecktype.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/check_name.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/describe_entityid.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/get_linkid.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/set_address.sql

-- ... all the objects ...

\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workerid_fkey.sql
\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workerstatusid_fkey.sql
\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workertypeid_fkey.sql

-- ...
and after all the included files come permissions and stuff:

--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--

REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
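For illustration only, here is a guess at what one of those per-object split files might contain; the file name, function body and owner below are invented, not taken from the actual dump:

-- public/FUNCTION/is_valid_name.sql (hypothetical file)
CREATE FUNCTION is_valid_name(_name text) RETURNS boolean
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT $1 ~ '^[A-Za-z ]+$' $$;

ALTER FUNCTION is_valid_name(text) OWNER TO postgres;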
Re: [HACKERS] Draft release notes complete
On Wed, May 23, 2012 at 01:38:06AM +0400, Alexander Korotkov wrote: > On Wed, May 23, 2012 at 1:26 AM, Bruce Momjian wrote: > > On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote: > > "Improve GiST box and point index performance by producing better trees > with > > less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas, > Kevin > > Grittner)" > > Is this note about following two commits? > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > 7f3bd86843e5aad84585a57d3f6b80db3c609916 > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > d50e1251946a6e59092f0a84fc903532eb599a4f > > These improvements influence not only boxes and points but all > geometrical > > datatypes. > > OK, new wording: > >Improve GiST geometric type index performance by producing better >trees with less memory allocation overhead (Alexander Korotkov) > > > Thanks! > > Also, I've some notes about removing reviewers. > "Improve GiST index build times (Alexander Korotkov)" > I think Heikki Linnakangas should be also listed as author of that patch > because he didn't only review and commit, but actually put his hands on code. OK, Heikki added. > Isn't my authorship of this patch lost now? > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > 80da9e68fdd70b796b3a7de3821589513596c0f7 > I think earlier this patch was taken into account in entry "Add support for > range data types". Probably, we need separate entry for this patch? I thought that was more of a Gist index improvement than a range type improvement, but I have added your name to the range type item. Thanks. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema version management
Joel Jacobson writes: > If the entire function identity arguments would be included in the filename, > two dumps of the same schema in two different databases > would be guaranteed to produce the same dump. > This would render some very long filenames for functions with many arguments, Thus, not implausibly, causing the dump to fail entirely on some filesystems. Case sensitivity, encoding issues, and special characters in names (eg slashes or backslashes, depending on platform) are additional pain points. This does not sound like a good plan from here. Taking a step or two back, it seems to me that the thrust of your proposal is essentially to throw away all dump ordering information, which does not seem like a particularly good idea either. It certainly will not lead to a dump that can be restored reliably. If the use-case for this is database comparisons, I think we'd be a lot better off to write a postprocessing tool for regular dumps to perform such comparisons, rather than whacking pg_dump around to the point where it's unable to perform its primary function. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse
> This seems like a rather expensive solution to a problem that I'm not
> really convinced is real. Why should a client program care about the
> severity level to a greater extent than whether the message is
> ErrorResponse or NoticeResponse? In particular, I'm entirely
> unconvinced by your claim that pgpool ought to treat ERROR and FATAL
> cases differently. Whatever it does about session termination ought to
> be driven by the connection closure, not by the content of the message.
> (As a counterexample, what if the backend crashes without delivering any
> message at all?)

Pgpool already detects session termination and handles it. BTW, pgpool uses a blocking socket and issues select(2), watching the read/exception state of the file descriptor before reading from the socket. It seems that even if the backend terminates the session, select() does not respond immediately but waits forever. I had thought that in this case select() would return and reading the socket would return EOF. So it seems it is sometimes hard to know whether the socket was closed by the backend. This is one of the reasons why I don't want to rely entirely on the physical session termination event.

Moreover, what pgpool would like to do here is provide better service to users, because pgpool is not just a proxy server for PostgreSQL. For example, pgpool does failover if one of the PostgreSQL nodes goes down. Pgpool can detect a planned postmaster shutdown by watching the health of PostgreSQL, but it is much better to trigger failover by detecting PostgreSQL's admin shutdown case, because it is quicker (less cluster down time) and more reliable (a health check needs to retry before judging the postmaster down, because of temporary network errors or some such).

> Moreover, if we did add this starting in 9.3, it would
> still be many years before clients could rely on it being provided,
> which means you'd need some other solution anyway.

Pgpool relies on PostgreSQL and cannot provide more services than what PostgreSQL does. Users can do more if they use a newer version of PostgreSQL. This applies not only to this particular problem but to other things as well.

> Another issue is that if we do this, we're essentially (IMO) promising
> that the set of severity codes won't change in the future, which may
> not be a good thing to promise.

Why do you care? The list of severities is clearly stated in the documentation, and you cannot arbitrarily change it without a major version change anyway. There's nothing different here from any other feature that PostgreSQL explicitly provides.

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema version management
On Wed, May 23, 2012 at 3:24 AM, Daniel Farina wrote:
> Is there a reason why the current directory format could not be
> adjusted to become more human-readable friendly for mechanical
> reasons? I realize there is a backwards compatibility problem, but it
> may be better than bloating a new option.

I like your idea; the format would then still be "directory", while the option would be something like --human-friendly? Currently, the directory format only dumps the data of tables into separate files. Everything else goes into the toc.dat file. To make it work, the stuff written to the toc.dat file would also have to be written to separate files.

> But I don't see how making the directory output format more
> human-friendly could be seen as a bad thing overall, except in the
> notable axis of implementation complexity. Silly issues like naming
> files on different platforms, case sensitivity, and file length
> restrictions may rear their ugly head.

If the entire set of function identity arguments were included in the filename, two dumps of the same schema in two different databases would be guaranteed to produce the same dump. This would render some very long filenames for functions with many arguments, but the problem could at least be reduced by using the shorter alias for each data type, such as "varchar" instead of "character varying" and "timestamptz" instead of "timestamp with time zone", etc. http://www.postgresql.org/docs/devel/static/datatype.html#DATATYPE-TABLE

Also, to save even more space, since the name of the function can be long too, the function name could be made a directory, and the different overloaded variants different files, e.g.:

/public/FUNCTION/myfunc/int.sql
/public/FUNCTION/myfunc/int_timestamptz.sql

And functions with no arguments are written to a single file (with a .sql suffix to avoid conflicts with an eventual directory name for the function):

/public/FUNCTION/myfunc.sql

> I think about this because in addition to the data types and operators
> defined in the development process, there are often small tables that
> need to be loaded with content and version controlled as well, rather
> like userland-equivalents pg_enum entries.

Is there a term for such tables? I use the term "lookup tables", but perhaps there is a better one? In my schema, they typically map status IDs, state IDs, etc. to human-friendly names. E.g., if Orders is a huge table for all orders, I might have an OrderStatuses table to look up all the OrderStatusID columns in Orders:

Orders.OrderStatusID -fk-> OrderStatuses.OrderStatusID

OrderStatuses.Name is unique and contains the human-friendly name of the status. These small lookup tables also need to be version controlled, of course.

This is a tricky one though, because you might have small tables with base data, but with references to other huge tables, which you don't want to include in your automatically version-controlled schema dump. I solved this problem by creating a quite complex recursive plpgsql function that resolves all dependencies and includes only the required rows from each table: you specify a regex matching a list of tables, which in turn resolves to all tables they have references to, and it dumps those tables too, but only the required rows. The result is a dump of each such table into a separate file, in a restorable order that does not cause any dependency problems. Then I have a similar function to do the restoring. I use this approach to build a restorable clean test database of any version of the system, be it production or some developer's local version of it.
And also, not to forget: to make it work, all the sequences also need to be restarted to the same values as in the original database after the dump is restored. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
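A hedged sketch of the kind of lookup table described above, using the names from the message; the exact column types and status values are assumptions:

CREATE TABLE OrderStatuses (
    OrderStatusID integer PRIMARY KEY,
    Name          text NOT NULL UNIQUE    -- human-friendly name of the status
);

CREATE TABLE Orders (
    OrderID       bigserial PRIMARY KEY,
    OrderStatusID integer NOT NULL REFERENCES OrderStatuses (OrderStatusID)
    -- ... many more columns ...
);

-- The small lookup table is what one would want under version control;
-- the huge Orders table is only referenced, not dumped.
INSERT INTO OrderStatuses VALUES (1, 'Pending'), (2, 'Shipped'), (3, 'Cancelled');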
Re: [HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse
Tatsuo Ishii writes: > I described the problem with possibly localized "S" filed of > ErrorResponse(and NoticeResponse) in Frontend/Backend protocol. > http://archives.postgresql.org/pgsql-hackers/2012-05/msg00912.php > So I would like to propose a solution for this (for 9.3): add new > field to ErrorResponse and NoticeResponse. The new field will have the > same value as "S" except that it's never localized. This will not only > solve the problem I described but possibly reduce the cost to analyze > the error/notice messages in the frontend programs. This seems like a rather expensive solution to a problem that I'm not really convinced is real. Why should a client program care about the severity level to a greater extent than whether the message is ErrorResponse or NoticeResponse? In particular, I'm entirely unconvinced by your claim that pgpool ought to treat ERROR and FATAL cases differently. Whatever it does about session termination ought to be driven by the connection closure, not by the content of the message. (As a counterexample, what if the backend crashes without delivering any message at all?) Moreover, if we did add this starting in 9.3, it would still be many years before clients could rely on it being provided, which means you'd need some other solution anyway. Another issue is that if we do this, we're essentially (IMO) promising that the set of severity codes won't change in the future, which may not be a good thing to promise. > BTW, changing existing "S" field not to be localized would work but > I'm afraid it breaks backward compatibility. We made it localized intentionally, on the grounds that its principal and probably sole use was for presentation to human users. I've not heard previous complaints about that decision. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On Tue, May 22, 2012 at 4:35 PM, Stephen Frost wrote: > * Josh Berkus (j...@agliodbs.com) wrote: >> The local role is preferred, the same way we allow objects in the local >> schema to overshadow objects in the global schema. > > I would think we'd want the exact opposite. I don't want my global > 'postgres' user to be overwritten by some local one that the admin of > this particular DB created.. In object-orientedness, the usual behaviour is for more specific methods to override the more generic ones, which is reasonable. I'm not certain which direction is more to be preferred, whether: a) To consider the global user as a default, to be overridden if possible, or b) To consider the local user as the default, to be overridden if possible. They're both tenable positions. But I think I agree with Stephen, that what's desirable, with global users, is to use them as the override. They're gonna be expensive, you should get something for the price :-). -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse
I described the problem with the possibly localized "S" field of ErrorResponse (and NoticeResponse) in the Frontend/Backend protocol: http://archives.postgresql.org/pgsql-hackers/2012-05/msg00912.php

So I would like to propose a solution for this (for 9.3): add a new field to ErrorResponse and NoticeResponse. The new field will have the same value as "S" except that it's never localized. This will not only solve the problem I described but possibly also reduce the cost of analyzing the error/notice messages in frontend programs.

Adding a new field is possible without breaking the current version of the Frontend/Backend protocol, since the protocol is extensible in this area: "Since more field types might be added in future, frontends should silently ignore fields of unrecognized type." (from "46.5. Message Formats" of the PostgreSQL 9.2 documentation)

BTW, changing the existing "S" field not to be localized would work, but I'm afraid it breaks backward compatibility. Comments?

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> That's only true if you try to satisfy both goals at once, which I'm > not suggesting. So I believe that proposition to be false. Oh, ok. Per your original email and follow-up arguments, you seemed to be doing just that. >> An alternative idea -- and one which could be deployed a lot faster -- >> is to come up with a tool which makes it easy to migrate an entire >> database into a separate schema or set of schemas in an existing >> database. And improvements to manage schema visility/path better, I >> suppose. > > Yes, it is possible to improve things there also. Feh, and nested schema, for that matter. So, there's a fair bit of work wherever we bite it off. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
> The issue with not allowing global spaces to overlap local ones is that
> we'd have to check every local list when creating a global account;
> that doesn't seem very easy to do. On the flip side, allowing
> duplicates between global and local would remove the need to check local
> lists when creating global accounts, but would add complexity and could
> lead to odd semantics when there is a duplicate.

On the other hand, keep in mind that creating a global account can be slow. For anyone who has a huge multi-tenant setup with 200 databases, each with their own local users, creating a new global account will be an event which occurs once or twice a year. Just so that we don't pay the same check cost for people who don't use local accounts.

> If you could help me work out the semantics and the high-level issues,
> I'd love to spend time on this for 9.3...

Syntax seems simple: CREATE LOCAL ROLE ...

For that matter, let's keep other things simple:
1. local roles can inherit only from other local roles
2. global roles can inherit only from other global roles
3. only a global role can be a database owner

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
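A sketch of how those rules might look in use. None of this exists today; CREATE LOCAL ROLE and the behaviour shown are purely hypothetical:

-- Global role: expensive to create, may own databases (rule 3).
CREATE ROLE app_owner LOGIN;
CREATE DATABASE tenant1 OWNER app_owner;

-- Inside tenant1: local roles visible only in this database (hypothetical syntax).
CREATE LOCAL ROLE tenant1_admin LOGIN;
CREATE LOCAL ROLE tenant1_reader LOGIN IN ROLE tenant1_admin;  -- rule 1: local inherits local

-- Rule 3 would forbid a local role owning a database:
-- CREATE DATABASE tenant2 OWNER tenant1_reader;               -- would be rejected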
Re: [HACKERS] Draft release notes complete
On Wed, May 23, 2012 at 1:26 AM, Bruce Momjian wrote: > On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote: > > "Improve GiST box and point index performance by producing better trees > with > > less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas, > Kevin > > Grittner)" > > Is this note about following two commits? > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > 7f3bd86843e5aad84585a57d3f6b80db3c609916 > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > d50e1251946a6e59092f0a84fc903532eb599a4f > > These improvements influence not only boxes and points but all > geometrical > > datatypes. > > OK, new wording: > >Improve GiST geometric type index performance by producing better >trees with less memory allocation overhead (Alexander Korotkov) Thanks! Also, I've some notes about removing reviewers. "Improve GiST index build times (Alexander Korotkov)" I think Heikki Linnakangas should be also listed as author of that patch because he didn't only review and commit, but actually put his hands on code. Isn't my authorship of this patch lost now? http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=80da9e68fdd70b796b3a7de3821589513596c0f7 I think earlier this patch was taken into account in entry "Add support for range data types". Probably, we need separate entry for this patch? -- With best regards, Alexander Korotkov.
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On 22 May 2012 18:41, Tom Lane wrote:
> It'd be better to work on index-organized tables

My earlier analysis showed that IOTs are essentially the same thing as block-level indexes, referred to as GITs by Heikki. (Robert referred to these as Lossy Indexes recently, which was not the case - that aspect was exactly the reason for rejection previously, so we should not retread that path - indexes can operate at block level without being lossy.) The number of index pointers is identical in each case, so IOTs are not any more efficient in terms of space usage or I/O.

IOTs are much more difficult to implement, so I can't see any reason to work on them. For example, having heap rows migrate on a block split will cause havoc with our index implementation. We haven't worked out how to re-join blocks that have split while maintaining concurrency, so IOTs would require some pretty drastic repacking with a severe lock type.

Please, let's avoid IOTs.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Wed, May 16, 2012 at 10:49:25PM +0300, Heikki Linnakangas wrote: > On 16.05.2012 22:38, Jeff Janes wrote: > >For item: > >Improve COPY performance by adding tuples to the heap in batches > >(Heikki Linnakangas) > > > >I think we should point out that the batching only applies for COPY > >into unindexed tables. Nice as the feature is, that is pretty big > >limitation not to mention. > > No, it applies to indexed tables too. However, if there are indexes > on the table, the overhead of updating the indexes will probably > make any speedup in the heap insertions look tiny in comparison. > > The optimization doesn't apply if the table has BEFORE or INSTEAD OF > triggers, or volatile DEFAULT expressions need to be evaluated. So, I will assume the existing wording is fine. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote: > "Improve GiST box and point index performance by producing better trees with > less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas, Kevin > Grittner)" > Is this note about following two commits? > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > 7f3bd86843e5aad84585a57d3f6b80db3c609916 > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > d50e1251946a6e59092f0a84fc903532eb599a4f > These improvements influence not only boxes and points but all geometrical > datatypes. OK, new wording: Improve GiST geometric type index performance by producing better trees with less memory allocation overhead (Alexander Korotkov) -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On 22 May 2012 19:01, Robert Haas wrote:
> On Tue, May 22, 2012 at 1:41 PM, Tom Lane wrote:
>> Jeff Janes writes:
>>> Now that there are index only scans, there is a use case for having a
>>> composite index which has the primary key or a unique key as the
>>> prefix column(s) but with extra columns after that. Currently you
>>> would also need another index with exactly the primary/unique key,
>>> which seems like a waste of storage and maintenance.
>>
>>> Should there be a way to declare a "unique" index with the unique
>>> property applying to a prefix of the indexed columns/expression? And
>>> having that, a way to turn that prefix into a primary key constraint?
>>
>>> Of course this is easier said than done, but is there some reason for
>>> it not to be a to-do item?
>>
>> Um ... other than it being ugly as sin? I can't say that I can get
>> excited about this concept. It'd be better to work on index-organized
>> tables, which is really more or less what you're wishing for here.
>> Duplicating most of a table into an index is always going to be a loser
>> in the end because of the redundant storage.
>
> An index on pgbench_accounts (aid, abalance) is the same size as an
> index on pgbench_accounts (aid), but even if it were larger, there's
> no theoretical reason it couldn't have enough utility to justify its
> existence.

Agreed

> A bigger problem is that creating such an index turns all
> of pgbench's write traffic from HOT updates into non-HOT updates,
> which means this is probably only going to be a win if the write
> volume is miniscule.

Not sure whether you see that as an argument against the proposal. This argument applies to any index. In particular, covered indexes are specifically encouraged by index-only scans, so this is not a reason to avoid implementing the feature as Jeff describes.

The main reason for the feature as described by Jeff is that it avoids having 2 indexes when only one is required. In 9.2, with index-only scans, Jeff showed elsewhere that we can get an amazing speedup by having a covered index. However, what Jeff is noticing is that he needs 2 indexes on the table: 1 PK on (aid) and another index on (aid, abalance). The first index can be avoided altogether, allowing a good improvement in cache efficiency and general performance.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
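To make the trade-off concrete, here is a sketch against pgbench's schema (pgbench already creates the primary key itself; the unique-on-a-prefix syntax at the end is hypothetical and only illustrates the proposal):

-- Today: two indexes are needed to get both the uniqueness guarantee on aid
-- and an index-only scan that can also return abalance.
ALTER TABLE pgbench_accounts ADD PRIMARY KEY (aid);   -- pgbench does this already
CREATE INDEX ON pgbench_accounts (aid, abalance);

-- The query that benefits from the covering index via an index-only scan:
SELECT abalance FROM pgbench_accounts WHERE aid = 12345;

-- The proposal would allow one index whose unique property applies only to the
-- prefix column(s); hypothetical syntax, not accepted by PostgreSQL today:
-- CREATE UNIQUE INDEX ON pgbench_accounts (aid, abalance) UNIQUE ON (aid);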
Re: [HACKERS] Per-Database Roles
On May22, 2012, at 22:35 , Stephen Frost wrote: > * Josh Berkus (j...@agliodbs.com) wrote: >> The local role is preferred, the same way we allow objects in the local >> schema to overshadow objects in the global schema. > > I would think we'd want the exact opposite. I don't want my global > 'postgres' user to be overwritten by some local one that the admin of > this particular DB created.. From a security POV, yup, you'd want global roles to take precedence. But OTOH, you wouldn't want your application to suddenly break because someone created a global role which shadows the local role you've created a year ago. Hm… thinking about this… No matter which takes precedence, we'd need some way to explicitly specify global or local scope anyway. And we'd have to prevent roles from being named in a way that conflicts with whatever explicit specification we come up with, even if that causes pain for some unlucky existing users. Avoiding these conflicts entirely is going to be impossible, I fear, since we don't currently restrict role names in any way AFAIK, and we store them in GUCs without any quoting. So maybe we should just pick some qualification like prefixing local roles with 'local.', forbid global roles from starting with 'local.', and be done with it? Not the most elegant solution maybe, but it avoids surprises... > The issue with not allowing global spaces to overlap local ones is that > we'd have to check every local list when creating a global account; > that doesn't seem very easy to do. "Not very easy" is quite an understatement, I fear. "Very nearly impossible" is more like it IMHO. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Wed, May 16, 2012 at 05:30:27PM -0400, Bruce Momjian wrote: > > I will make the adjustments outlined below as soon as I can. Done and committed. --- > > On Sun, May 13, 2012 at 12:37:52AM -0400, Robert Haas wrote: > > On Sat, May 12, 2012 at 8:11 PM, Euler Taveira wrote: > > > On 12-05-2012 10:27, Bruce Momjian wrote: > > >> How many names on a single item is ideal? The activity of reviewers and > > >> their names on commit messages has greatly expanded the number of > > >> potential names per item. > > >> > > > Main authors only. Reviewers should be mentioned only in the commit log. > > > If I > > > coded a feature and Bruce got the idea worked in another patch (that is > > > better > > > then mine), I think only Bruce should be credited in release notes (but I > > > could be mentioned in the commit log as the feature designer). However, > > > if I > > > posted a patch and Robert improved that patch using only 30% of my work, I > > > should be credited (as coauthor) because he used a considerable part of > > > my work. > > > > Completely agreed. If we're going to include names in the release > > notes, I agree that this is the way to do it, and I think it's what we > > have done in prior releases. > > > > I tend to err on the side of crediting people in the commit message > > (of course, occasionally I forget someone who should have been > > included), but I also try to make it clear by the phrasing whose code > > got included and who contributed in some other way - e.g. by reporting > > the problem, coming up with the original idea, or reviewing. I do > > this in part because I assumed that we'd use that as the criteria for > > including names in the release notes, as we have done in prior > > releases. So if I write: > > > > Euler Taveira, reviewed by Bruce Momjian, substantially rewritten by me > > > > ...then I expect that to turn up in the release notes as (Euler > > Taveira, Robert Haas). If I write: > > > > Euler Taveira, reviewed by Bruce Momjian, with minor cleanup by me > > > > ...then I expect that to turn up as (Euler Taveira). And if I write > > something like: > > > > Inspired by a patch from Euler Taveira. Review (in earlier versions) > > by Bruce Momjian. > > > > ...then I expect that to turn up as (Robert Haas) or (Robert Haas, > > Euler Taveira). > > > > In doubtful cases, I think it's generally appropriate to err on the > > side of crediting the person who was the original driving force behind > > the patch, and also to err on the side of not crediting the committer. > > But if the committer chopped up the patch and committed something > > significantly different from the original, then they should be > > credited - or blamed - for the result. > > > > -- > > Robert Haas > > EnterpriseDB: http://www.enterprisedb.com > > The Enterprise PostgreSQL Company > > -- > Bruce Momjian http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. + > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 18:56, Josh Berkus wrote: > I'm not arguing that we don't have users who would like interdatabase > queries, especially when they port applications from MySQL or MSSQL. We > have a lot of such users. Lots and lots, yes. > However, we *also* have a lot of users who > would like to treat separate databases as virtual private instances of > Postgres, and there's no way to satisfy both goals. We have to choose > one route or the other. That's only true if you try to satisfy both goals at once, which I'm not suggesting. So I believe that proposition to be false. However, given sufficient people speaking against it, I'll leave this idea. Though I'd suggest that people on this thread spend a little quality time with FDWs. It's a great direction but there's a long way to go yet. Sorry to Laurenz, who's done a great job so far on the Oracle FDW. > I personally see the isolation case as the more necessary because there > are several workarounds for the "inter-database queries" issue I also want that, per my original post. > An alternative idea -- and one which could be deployed a lot faster -- > is to come up with a tool which makes it easy to migrate an entire > database into a separate schema or set of schemas in an existing > database. And improvements to manage schema visility/path better, I > suppose. Yes, it is possible to improve things there also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 8:28 PM, Robert Haas wrote:
> Totally agreed. We're not the first people to think of this, either:
> CLOCK and GLOCK have been extensively studied and found to be almost
> as good as LRU in selecting good victim pages, but with less
> contention. That's why people are using them.
>
> Here's a paper that defines GLOCK to be the algorithm we use (page 2,
> second column, second paragraph from the bottom), and furthermore
> mentions PostgreSQL (top of page 3):
>
> http://staff.aist.go.jp/m.yui/publications/ICDE10_conf_full_409.pdf

Interesting paper; they make the whole of buffer management lock-free, getting rid of not only the BufFreelistLock but also the BufMappingLocks and the buffer header spinlocks. Now, AFAIK the BufMappingLocks and buffer header spinlocks aren't really a big issue for us, and at least for the latter it looks like turning it lock-free would entail lots of pretty hairy and bug-prone code.

On the other hand, making the clock sweep lock-free looks relatively easy. As far as I can see there is no reason why nextVictimBuffer couldn't be read, incremented and atomically cmpxchg'ed to let others continue before checking the usage count. Likewise, completePasses and numBufferAllocs can easily be atomically incremented; the bgwriter shouldn't mind if the values are slightly out of date. The free list itself is a bit trickier, but if it's still necessary/useful, then SC->firstFreeBuffer and buf->freeNext are in effect a linked-list stack, and there should be plenty of tested lock-free algorithms floating around for that. (BTW, lastFreeBuffer looks like dead code, is that correct?)

As for better buffer management algorithms, have you read about CLOCK-Pro? [1] It looks like it's an adaptive variant of LIRS, the algorithm MySQL uses (or at least used some time ago). It looks like the Linux kernel devs also at least thought about implementing it [2] [3] (hard to tell exactly, their docs are pretty chaotic compared to pg's). According to [4], LIRS is almost as good as or better than LRU; by extension I'd expect CLOCK-Pro to be better than GLOCK. It still has a single clock mechanism, though, so better replacement policies won't help a whole lot with lock contention on buffer allocation.

[1] http://www.cse.ohio-state.edu/~fchen/paper/papers/usenix05.pdf
[2] http://linux-mm.org/ClockProApproximation
[3] http://linux-mm.org/PageReplacementDesign
[4] http://www.almaden.ibm.com/cs/people/dmodha/ARC.pdf

Cheers, Ants Aasma -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
* Josh Berkus (j...@agliodbs.com) wrote: > The local role is preferred, the same way we allow objects in the local > schema to overshadow objects in the global schema. I would think we'd want the exact opposite. I don't want my global 'postgres' user to be overwritten by some local one that the admin of this particular DB created.. > The feature wouldn't be useful if we didn't allow conflicts between two > local role names. However, we could prohibit conflicts between a local > role name and a global role name if it made the feature considerably > easier. Users would find workarounds which weren't too arduous. Sorry, I was meaning between global space and local space. Clearly we must allow and handle cleanly overlaps between local spaces. The issue with not allowing global spaces to overlap local ones is that we'd have to check every local list when creating a global account; that doesn't seem very easy to do. On the flip side, allowing duplicates between global and local would remove the need to check local lists when creating global accounts, but would add complexity and could lead to odd semantics when there is a duplicate. > 1. create a new local role > 2. reassign all the objects belonging to the global role to the local role > 3. drop the global role > 4. rename the local role Right, that seems like it would work fine. > It'd be somewhat of a PITA, but I suspect that most people using the > "local roles" feature would recreate their databases from scratch > anyway. And we could offer some sample scripts for the above on the > wiki and elsewhere. Obviously, a more elegant migration command would > be ideal, but that could wait for the following PG release; we usually > follow the "make things possible first, and easy later" plan anyway. Sure. > Given that I'd love to have this feature, I'm trying to pare down its > requirements to a managable size. Trying to do everything at once will > only result in the feature stalling until 10.5. If you could help me work out the semantics and the high-level issues, I'd love to spend time on this for 9.3... Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Schema version management
On Mon, May 21, 2012 at 5:25 PM, Joel Jacobson wrote:
> If one wants to reuse the splitting-to-files code of the directory
> format, maybe the existing option -F d could be tweaked to output
> both a machine-readable format (current way) and also a
> human-friendly tree of files and content (like suggested by my patch).
>
> I wonder what the option would be called then; having two-char
> options is not an option I guess, maybe -F t for "tree" instead of
> "directory", as the -F d option only dumps to a single directory and
> not a tree structure?

Is there a reason why the current directory format could not be adjusted to become more human-readable friendly for mechanical reasons? I realize there is a backwards compatibility problem, but it may be better than bloating a new option.

Andrew's approach of reading the TOC would also be good, so that pg_dump can avoid serving the additional masters of schema versioning and development usability in addition to dumping. The TOC is the closest thing we have to the library-ification of pg_dump in the near term.

But I don't see how making the directory output format more human-friendly could be seen as a bad thing overall, except in the notable axis of implementation complexity. Silly issues like naming files on different platforms, case sensitivity, and file length restrictions may rear their ugly head.

I think about this because, in addition to the data types and operators defined in the development process, there are often small tables that need to be loaded with content and version controlled as well, rather like userland equivalents of pg_enum entries.

-- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 10:56 AM, Josh Berkus wrote:
> I'm not arguing that we don't have users who would like interdatabase
> queries, especially when they port applications from MySQL or MSSQL. We
> have a lot of such users. However, we *also* have a lot of users who
> would like to treat separate databases as virtual private instances of
> Postgres, and there's no way to satisfy both goals. We have to choose
> one route or the other.

I think the idea of a physical machine where catalogs are physically (shared-everything) co-located is one that will not stand for long as part of a useful contract between a user and the database. I'd really like to avoid an extra tier of functionality that exists only for databases that happen to land on the same physical machine. I think any inter-database feature should work identically between two databases across a network as between two databases on one machine/cluster.

Transparent optimizations to deal with the special case of physical co-location are not contrary to that contract, but I don't have a sense of how important those optimizations would be before getting a lot of the usability issues figured out. Right now, it seems to me that getting interdatabase usability feeling better is already pretty hard.

-- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On Tue, May 22, 2012 at 1:36 PM, Simon Riggs wrote: > On 22 May 2012 18:24, Jeff Janes wrote: > > Now that there are index only scans, there is a use case for having a > > composite index which has the primary key or a unique key as the > > prefix column(s) but with extra columns after that. Currently you > > would also need another index with exactly the primary/unique key, > > which seems like a waste of storage and maintenance. > > > > Should there be a way to declare a "unique" index with the unique > > property applying to a prefix of the indexed columns/expression? And > > having that, a way to turn that prefix into a primary key constraint? > > > > Of course this is easier said then done, but is there some reason for > > it not to be a to-do item? > > +1 > > Very useful > > Semi-private note to Simon: isn't this pretty much what I was advocating at the London meetup last month?
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On Tue, May 22, 2012 at 10:41 AM, Tom Lane wrote: > Jeff Janes writes: >> Now that there are index only scans, there is a use case for having a >> composite index which has the primary key or a unique key as the >> prefix column(s) but with extra columns after that. Currently you >> would also need another index with exactly the primary/unique key, >> which seems like a waste of storage and maintenance. > >> Should there be a way to declare a "unique" index with the unique >> property applying to a prefix of the indexed columns/expression? And >> having that, a way to turn that prefix into a primary key constraint? > >> Of course this is easier said then done, but is there some reason for >> it not to be a to-do item? > > Um ... other than it being ugly as sin? I can't say that I can get > excited about this concept. It'd be better to work on index-organized > tables, which is really more or less what you're wishing for here. IOT would probably be a nice feature too, but adding one more strategically chosen column to an index is quite different from adding every column into the index. At least in the general case. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: parameterized function scan
On Fri, May 11, 2012 at 5:52 PM, Antonin Houska wrote: > Hello, > following this short discussion > http://archives.postgresql.org/message-id/4f5aa202.9020...@gmail.com > I gave it one more try and hacked the optimizer so that function can become > an inner relation in NL join, parametrized with values from the outer > relation. > > I tried to explain my thoughts in comments. Other than that: > > 1. I haven't tried to use SpecialJoinInfo to constrain join order. Even if > the order matters in query like > SELECT * from a, func(a.i) > it's still inner join by nature. SpecialJoinInfo is used for INNER join > rarely, but never stored in PlannerInfo. Doing so only for these lateral > functions would be rather disruptive. > > 2. Simple SQL function (i.e. one that gets pulled-up into the main query) is > a special case. The query that results from such a pull-up no longer > contains any function (and thus is not affected by this patch) but such > cases need to be newly taken into account and examined / tested (the patch > unblocks them at parsing stage too). > > 3. There might be some open questions about SQL conformance. > > I've spent quite a while looking into the optimizer code and after all I was > surprised that it didn't require that many changes. At least to make few > simple examples work. Do I ignore any important fact(s) ? This implementation looks different than I'd expect: I would have thought that it would work by generating paths with param_info set to the appropriate set of rels to provide the necessary values, rather than inventing its own mechanism for forcing a nestloop. Also, I think we will want something that implements the LATERAL() syntax, rather than just removing the prohibition on lateral references. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
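A sketch of the kind of query the patch targets (the table and function are invented). The first form is the lateral function reference the patch unblocks; the second spells it with the explicit LATERAL keyword Robert mentions:

CREATE TABLE a (i int);
CREATE FUNCTION func(int) RETURNS SETOF int
    LANGUAGE sql AS $$ SELECT $1 * 2 $$;    -- placeholder body

-- func is re-evaluated for each row of a, i.e. a nested loop with a
-- parameterized function scan on the inner side:
SELECT * FROM a, func(a.i);

-- The same query with the explicit LATERAL syntax:
SELECT * FROM a, LATERAL func(a.i);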
Re: [HACKERS] Exclusion Constraints on Arrays?
On May 22, 2012, at 9:56 AM, Tom Lane wrote: > It seems probably workable given that we expect the pending list to be > of fairly constrained size. However, the commit message referenced > upthread also muttered darkly about GIN's partial match logic not working > in amgettuple. I do not recall the details of that issue, but unless we > can solve that one too, there's not much use in fixing this one. Well, what about a GiST operator family/class for arrays? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
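For context, a hedged sketch of the contrast: range types ship with a GiST operator class, so an overlap exclusion constraint works out of the box, while the array form below is exactly what fails today for lack of a suitable operator class (the tables are invented):

-- Works today: built-in GiST support for range types.
CREATE TABLE room_booking (
    room   text,
    during tsrange,
    EXCLUDE USING gist (during WITH &&)
);

-- What the thread is after: the same idea for arrays. This currently fails
-- because there is no default GiST operator class for text[].
CREATE TABLE tagged_item (
    id   serial PRIMARY KEY,
    tags text[],
    EXCLUDE USING gist (tags WITH &&)
);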
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On Tue, May 22, 2012 at 1:41 PM, Tom Lane wrote: > Jeff Janes writes: >> Now that there are index only scans, there is a use case for having a >> composite index which has the primary key or a unique key as the >> prefix column(s) but with extra columns after that. Currently you >> would also need another index with exactly the primary/unique key, >> which seems like a waste of storage and maintenance. > >> Should there be a way to declare a "unique" index with the unique >> property applying to a prefix of the indexed columns/expression? And >> having that, a way to turn that prefix into a primary key constraint? > >> Of course this is easier said then done, but is there some reason for >> it not to be a to-do item? > > Um ... other than it being ugly as sin? I can't say that I can get > excited about this concept. It'd be better to work on index-organized > tables, which is really more or less what you're wishing for here. > Duplicating most of a table into an index is always going to be a loser > in the end because of the redundant storage. An index on pgbench_accounts (aid, abalance) is the same size as an index on pgbench_accounts (aid), but even if it were larger, there's no theoretical reason it couldn't have enough utility to justify its existence. A bigger problem is that creating such an index turns all of pgbench's write traffic from HOT updates into non-HOT updates, which means this is probably only going to be a win if the write volume is miniscule. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> Why is it OK to allow somebody to access multiple schemas in one query,
> but not multiple databases? Are you arguing that schemas are also
> broken?

Because the purpose of a database is to be a Catalog, i.e. an isolated container, which is not the purpose of schemas. To the extent to which we can make the isolation a reality (instead of the "lossy isolation" we have now), we can enable many multitenant hosting designs which aren't currently possible.

However, pursuing interdatabase queries at the same time we try to add new isolation features is a doomed effort. For example, if we created local database users (see other thread), then what would happen if a local user tried to execute a cross-database query? If we enable physical migration of a database to a new Postgres instance, what happens to standing multi-database views? If interdatabase queries are allowed, how do I, as a hosting operator, make sure that users can't even see the other databases on the system?

> I see no failure by design. I see an idea for greater ease of use
> being discussed.

You can't attempt mutually contradictory requirements and expect to succeed, or to improve ease of use. You can't ride two horses, especially if they're going in opposite directions.

> Personally, I have long recommended that people use schemas. But
> people do use databases and when they do they are pretty much screwed.
> I brought this up as a way of improving our ease of use.

I'm not arguing that we don't have users who would like interdatabase queries, especially when they port applications from MySQL or MSSQL. We have a lot of such users. However, we *also* have a lot of users who would like to treat separate databases as virtual private instances of Postgres, and there's no way to satisfy both goals. We have to choose one route or the other.

I personally see the isolation case as the more necessary because there are several workarounds for the "inter-database queries" issue, but nothing for the "multitenant catalog" case. Also, treating databases as catalogs is more consistent with our historical approach, and will thus break backwards compatibility less.

Maybe interdatabase queries are more useful/desired than catalog features. If that's the case, then we need to pursue them and abandon efforts like per-database users. But we have to make a choice.

An alternative idea -- and one which could be deployed a lot faster -- is to come up with a tool which makes it easy to migrate an entire database into a separate schema or set of schemas in an existing database. And improvements to manage schema visibility/path better, I suppose.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Problem with error response message
On 22 May 2012 12:59, Tatsuo Ishii wrote: > Also I wonder why "conflict with recovery" generates the same error code > as a serialization error. That does not seem a very consistent choice to me. > IMO, we should at least assign different error codes for effects that are > totally different for the frontend: aborting the transaction versus > aborting the whole session. Some conflicts cause ERROR, some cause FATAL. So there is not a distinct difference. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 10:43 AM, Simon Riggs wrote: > On 22 May 2012 18:35, Josh Berkus wrote: >> >>> If I have a customer with 1 database per user, how do they run a query >>> against 100 user tables? It would require 100 connections to the >>> database. Doing that would require roughly x100 the planning time and >>> x100 the connection delay. Larger SQL statements pass their results >>> between executor steps using libpq rather than direct calls. >> >> Why is this hypothetical customer using separate databases? This really >> seems like a case of "doctor, it hurts when I do this". > > Databases are great for separating things, but sometimes you want to > un-separate them in a practical way. In my experience, these un-separations are (thankfully) relieved of the requirement of consistency between databases, and so the contract is much more favorable. The planning time problem is quite hard. However, I think the connection-delay one is not as hard a one to answer: I think multiplexed protocols are going to become the norm in the near future (they have been a pretty uncontested part of the SPDY protocol, for example, after flow control was added) and have a number of useful properties, and it may be time to consider how we're going to divorce the notion that one socket implies exactly one backend. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 1:27 PM, Simon Riggs wrote: > Ack, apart from the bit about OIDs no longer being unique. That might > be an upgrade issue but it's obviously something we wouldn't allow if > we did that. And how exactly are you going to disallow that? We currently enforce the uniqueness of OIDs within a database by putting a unique index on the relevant system catalog columns, but that clearly won't work as a means of guaranteeing cross-database uniqueness. Unless of course you put all the catalog entries from all the databases into a single set of catalogs; but then they're pretty much the same as the schemas we already have. > I'm not sure I see changing the caches as being massive surgery. > Perhaps we could just bypass them altogether. You're entitled to your opinion on this one, but we have those caches for a very good reason: the system runs about 20,000 times slower without them. >> So I'd rather see us put the effort into pgsql_fdw, which, as Florian >> says, will also let us talk to a completely separate server. If >> you've got multiple databases in the same cluster and really need to >> be doing queries across all of them, that's what schemas are >> supposedly for. Now I know that people feel that doesn't work as well >> as it needs to, but there again I think it would be easier to fix >> schemas than to make cross-database queries work. > > We're a very long way from making that work well. IMHO easily much > further than direct access. > > If I have a customer with 1 database per user, how do they run a query > against 100 user tables? It would require 100 connections to the > database. Doing that would require roughly x100 the planning time and > x100 the connection delay. Larger SQL statements pass their results > between executor steps using libpq rather than direct calls. > > I find it hard to believe FDWs will ever work sufficiently well to fix > those problems. If you have got a customer with one database per user, and yet you want to run queries across all those tables, then you're using the database system for something for which it is not designed, and it's probably not going to work very well. That sounds like a use case for schemas, or maybe better, some kind of row-level security we don't have yet - but hopefully will in 9.3, since KaiGai intends to work on it. Databases are designed to act as a firewall, so that somebody using one database isn't affected by what happens in some other database. Unfortunately, because of shared catalogs, that's not completely true, but that's the idea, and if anything we need to isolate them more, not less. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 18:35, Josh Berkus wrote: > >> If I have a customer with 1 database per user, how do they run a query >> against 100 user tables? It would require 100 connections to the >> database. Doing that would require roughly x100 the planning time and >> x100 the connection delay. Larger SQL statements pass their results >> between executor steps using libpq rather than direct calls. > > Why is this hypothetical customer using separate databases? This really > seems like a case of "doctor, it hurts when I do this". Databases are great for separating things, but sometimes you want to un-separate them in a practical way. I'm surprised that you're so negative about an ease of use feature. I had understood you cared about fixing problems experienced by our developers. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
Jeff Janes writes: > Now that there are index only scans, there is a use case for having a > composite index which has the primary key or a unique key as the > prefix column(s) but with extra columns after that. Currently you > would also need another index with exactly the primary/unique key, > which seems like a waste of storage and maintenance. > Should there be a way to declare a "unique" index with the unique > property applying to a prefix of the indexed columns/expression? And > having that, a way to turn that prefix into a primary key constraint? > Of course this is easier said than done, but is there some reason for > it not to be a to-do item? Um ... other than it being ugly as sin? I can't say that I can get excited about this concept. It'd be better to work on index-organized tables, which is really more or less what you're wishing for here. Duplicating most of a table into an index is always going to be a loser in the end because of the redundant storage. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On 22 May 2012 18:24, Jeff Janes wrote: > Now that there are index only scans, there is a use case for having a > composite index which has the primary key or a unique key as the > prefix column(s) but with extra columns after that. Currently you > would also need another index with exactly the primary/unique key, > which seems like a waste of storage and maintenance. > > Should there be a way to declare a "unique" index with the unique > property applying to a prefix of the indexed columns/expression? And > having that, a way to turn that prefix into a primary key constraint? > > Of course this is easier said than done, but is there some reason for > it not to be a to-do item? +1 Very useful -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> If I have a customer with 1 database per user, how do they run a query > against 100 user tables? It would require 100 connections to the > database. Doing that would require roughly x100 the planning time and > x100 the connection delay. Larger SQL statements pass their results > between executor steps using libpq rather than direct calls. Why is this hypothetical customer using separate databases? This really seems like a case of "doctor, it hurts when I do this". -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 18:18, Josh Berkus wrote: > >> 1. Ability to have a Role that can only access one Database >> >> 2. Allow user info to be dumped with a database, to make a db >> completely self-consistent >> >> 3. Allow databases to be transportable >> >> 4. Allow users to access tables in >1 database easily, with appropriate >> rights. > > The last requirement seems completely contradictory to the other three. > Either we're trying to make databases even more isolated as > multi-tenant Catalogs, or we're not. Trying to do both at the same time > is failure-by-design. Why is it OK to allow somebody to access multiple schemas in one query, but not multiple databases? Are you arguing that schemas are also broken? I see no failure by design. I see an idea for greater ease of use being discussed. > Given that we offer schema as an alternative to multiple databases, and > users are starting to get used to them, I think that requirement (4) is > just a bad idea, and not worth pursuing, Personally, I have long recommended that people use schemas. But people do use databases and when they do they are pretty much screwed. I brought this up as a way of improving our ease of use. > except in the context of pgsql_fdw. That is not a realistic option. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
Stephen, > Which is "preferred" > when you do a 'grant select' or 'grant role'? The local role is preferred, the same way we allow objects in the local schema to overshadow objects in the global schema. > Or do we just disallow > overlaps between per-DB roles and global roles? If we don't allow > duplicates, I suspect a lot of the other questions suddenly become a lot > easier to deal with, but would that be too much of a restriction? The feature wouldn't be useful if we didn't allow conflicts between two local role names. However, we could prohibit conflicts between a local role name and a global role name if it made the feature considerably easier. Users would find workarounds which weren't too arduous. > How > would you handle migrating an existing global role to a per-database > role? Again, I think it would be OK not handling it. i.e., the user needs to do the following: 1. create a new local role 2. reassign all the objects belonging to the global role to the local role 3. drop the global role 4. rename the local role It'd be somewhat of a PITA, but I suspect that most people using the "local roles" feature would recreate their databases from scratch anyway. And we could offer some sample scripts for the above on the wiki and elsewhere. Obviously, a more elegant migration command would be ideal, but that could wait for the following PG release; we usually follow the "make things possible first, and easy later" plan anyway. Given that I'd love to have this feature, I'm trying to pare down its requirements to a manageable size. Trying to do everything at once will only result in the feature stalling until 10.5. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
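A rough sketch of those four steps in SQL, using the hypothetical local-role syntax floated elsewhere in this thread; CREATE LOCAL ROLE does not exist today and the role names are made up, but REASSIGN OWNED, DROP ROLE, and ALTER ROLE ... RENAME are existing commands:

    -- 1. create a new local role (hypothetical syntax)
    CREATE LOCAL ROLE app_owner_local LOGIN;
    -- 2. reassign all objects owned by the global role
    REASSIGN OWNED BY app_owner TO app_owner_local;
    -- 3. drop the global role
    DROP ROLE app_owner;
    -- 4. rename the local role to take over the old name
    ALTER ROLE app_owner_local RENAME TO app_owner;

A real sample script would also have to deal with the fact that DROP ROLE complains if the global role still owns objects or holds privileges in other databases, which is part of what makes a more elegant migration command attractive.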
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 12:39 PM, Tom Lane wrote: > Robert Haas writes: >> If we're going to throw our current algorithm over wholesale, I'd >> rather use some approach that has been demonstrated to work well in >> other systems. Buffer eviction is a problem that's been around since >> the 1970s, and our algorithm is just about that old. > > Um, if you're claiming that that code dates from Berkeley days, you're > quite mistaken. Not the code: the algorithm. I believe that what we have implemented is GCLOCK, which is very old. In the interest of full disclosure, descriptions of exactly what GCLOCK is seem to vary a bit from paper to paper, but at least some papers describe the exact algorithm we use. > We adopted the clock sweep in 2005, after trying a few > other things whose performance was worse. I've not seen any argument in > this thread that suggests we should abandon clock sweep + usage counts > entirely. Rather, to me the issue is that we haven't completely gotten > rid of the last vestiges of the old global freelist approach. Well, I think that switching from one clock sweep to a clock sweep per backend would be basically an abandonment of the current approach. The results might be better or worse, but they'd surely be different. > BTW, it might be worth pointing out something I was trying to explain > to Amit at PGCon: the key reason that we went with usage counters rather > than something like a global LRU chain is that in the fast path where > ReadBuffer finds the requested block already in a buffer, it does not > have to contend for any global data structure to update the buffer's > usage information. It just has to bump the usage count in the buffer's > header. The free list, and the contention for BufFreelistLock, only > matter in the slow path where you're going to have to incur I/O anyway > (or at least a visit to the kernel). That seemed plenty good enough > in 2005. Our ambitions have now advanced further, so I'm on board with > trying to reduce contention here too, but I think it would be a mistake > to make the fast case any slower. Totally agreed. We're not the first people to think of this, either: CLOCK and GCLOCK have been extensively studied and found to be almost as good as LRU in selecting good victim pages, but with less contention. That's why people are using them. Here's a paper that defines GCLOCK to be the algorithm we use (page 2, second column, second paragraph from the bottom), and furthermore mentions PostgreSQL (top of page 3): http://staff.aist.go.jp/m.yui/publications/ICDE10_conf_full_409.pdf -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 14:05, Robert Haas wrote: > On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan wrote: >> That seems to be leaving aside the fact that we don't currently have any >> notion of how to allow FDWs to write the foreign tables. >> >> What is more, isn't the postgres FDW about talking to any postgres source? >> If so, does it have special smarts for when we are talking to ourselves? And >> if it doesn't then it seems unlikely to be an acceptable substitute for >> allowing talking direct to a sibling database. >> >> I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would >> adequately meet the case if we wanted to go that way. > > Well, I don't think anyone is claiming that FDWs as they exist today > solve all of the problems in this area. But I think that improving > FDWs is a more promising line of attack than trying to make backends > talk to multiple databases. Doing the latter will require massive > surgery on the relcache, the catcache, most of the secondary catalog > caches, the ProcArray, and every portion of the backend that thinks an > OID uniquely identifies an SQL object. Basically, they'd all need > database OID as an additional key field, which is undesirable for > performance reasons even if there were no issue of code churn. Ack, apart from the bit about OIDs no longer being unique. That might be an upgrade issue but it's obviously something we wouldn't allow if we did that. I'm not sure I see changing the caches as being massive surgery. Perhaps we could just bypass them altogether. > So I'd rather see us put the effort into pgsql_fdw, which, as Florian > says, will also let us talk to a completely separate server. If > you've got multiple databases in the same cluster and really need to > be doing queries across all of them, that's what schemas are > supposedly for. Now I know that people feel that doesn't work as well > as it needs to, but there again I think it would be easier to fix > schemas than to make cross-database queries work. We're a very long way from making that work well. IMHO easily much further than direct access. If I have a customer with 1 database per user, how do they run a query against 100 user tables? It would require 100 connections to the database. Doing that would require roughly x100 the planning time and x100 the connection delay. Larger SQL statements pass their results between executor steps using libpq rather than direct calls. I find it hard to believe FDWs will ever work sufficiently well to fix those problems. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Add primary key/unique constraint using prefix columns of an index
Now that there are index only scans, there is a use case for having a composite index which has the primary key or a unique key as the prefix column(s) but with extra columns after that. Currently you would also need another index with exactly the primary/unique key, which seems like a waste of storage and maintenance. Should there be a way to declare a "unique" index with the unique property applying to a prefix of the indexed columns/expression? And having that, a way to turn that prefix into a primary key constraint? Of course this is easier said than done, but is there some reason for it not to be a to-do item? Thanks, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> 1. Ability to have a Role that can only access one Database > > 2. Allow user info to be dumped with a database, to make a db > completely self-consistent > > 3. Allow databases to be transportable > > 4. Allow users to access tables in >1 database easily, with appropriate > rights. The last requirement seems completely contradictory to the other three. Either we're trying to make databases even more isolated as multi-tenant Catalogs, or we're not. Trying to do both at the same time is failure-by-design. Given that we offer schema as an alternative to multiple databases, and users are starting to get used to them, I think that requirement (4) is just a bad idea, and not worth pursuing, except in the context of pgsql_fdw. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How could we make it simple to access the log as a table?
On Mon, May 21, 2012 at 05:53:55PM -0700, Josh Berkus wrote: > Hackers, > > There's a lot of great information in the postgres logs. While we > eventually want to get more sophisticated about providing users with > status and history information, for 9.3 it would be really nice to > just offer the current logs in system view format. > > Certainly this can be done ad-hoc using CSV format and csv_fdw. > However, such setups are fragile due to log rotation and other > issues. It seems like we could come up with a better way. Ideas? Would it help to automate this stuff with that CSV format? The trick, as I see it, is to make those on-disk log files be partitions of a table. Stephen Frost's point is well-taken, but I'm not sure we need to make it a blocker for this. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
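For reference, the ad-hoc CSV setup being alluded to looks roughly like the file_fdw example in the documentation; the server name, table name, and filename below are assumptions, and the hard-coded filename is precisely the log-rotation fragility being discussed:

    CREATE EXTENSION file_fdw;
    CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
    CREATE FOREIGN TABLE pglog (
      log_time timestamp(3) with time zone,
      user_name text,
      database_name text,
      process_id integer,
      connection_from text,
      session_id text,
      session_line_num bigint,
      command_tag text,
      session_start_time timestamp with time zone,
      virtual_transaction_id text,
      transaction_id bigint,
      error_severity text,
      sql_state_code text,
      message text,
      detail text,
      hint text,
      internal_query text,
      internal_query_pos integer,
      context text,
      query text,
      query_pos integer,
      location text,
      application_name text
    ) SERVER pglog
    OPTIONS ( filename 'pg_log/postgresql.csv', format 'csv' );

Making the rotating set of such files behave as partitions of one table, as suggested above, is the part that would need new machinery.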
Re: [HACKERS] heap metapages
On 22 May 2012 13:52, Robert Haas wrote: > It seems pretty clear to me that making pg_upgrade responsible for > emptying block zero is a non-starter. But I don't think that's a > reason to throw out the design; I think it's a problem we can work > around. I like your design better as well *if* you can explain how we can get to it. My proposal was a practical alternative that would allow the idea to proceed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exclusion Constraints on Arrays?
Robert Haas writes: > On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler > wrote: >> On May 13, 2012, at 3:45 PM, Robert Haas wrote: >>> It seems like maybe we could work around this by remembering the >>> contents of the pending list throughout the scan. Every time we hit a >>> TID while scanning the main index, we check whether we already >>> returned it from the pending list; if so, we skip it, but if not, we >>> return it. >> Should this go onto the To-Do list, then? > If someone other than me can confirm that it's not a stupid approach, > I would say yes. It seems probably workable given that we expect the pending list to be of fairly constrained size. However, the commit message referenced upthread also muttered darkly about GIN's partial match logic not working in amgettuple. I do not recall the details of that issue, but unless we can solve that one too, there's not much use in fixing this one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exclusion Constraints on Arrays?
On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler wrote: > On May 13, 2012, at 3:45 PM, Robert Haas wrote: > >> It seems like maybe we could work around this by remembering the >> contents of the pending list throughout the scan. Every time we hit a >> TID while scanning the main index, we check whether we already >> returned it from the pending list; if so, we skip it, but if not, we >> return it. > > Should this go onto the To-Do list, then? If someone other than me can confirm that it's not a stupid approach, I would say yes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > If we're going to throw our current algorithm over wholesale, I'd > rather use some approach that has been demonstrated to work well in > other systems. Buffer eviction is a problem that's been around since > the 1970s, and our algorithm is just about that old. Um, if you're claiming that that code dates from Berkeley days, you're quite mistaken. We adopted the clock sweep in 2005, after trying a few other things whose performance was worse. I've not seen any argument in this thread that suggests we should abandon clock sweep + usage counts entirely. Rather, to me the issue is that we haven't completely gotten rid of the last vestiges of the old global freelist approach. BTW, it might be worth pointing out something I was trying to explain to Amit at PGCon: the key reason that we went with usage counters rather than something like a global LRU chain is that in the fast path where ReadBuffer finds the requested block already in a buffer, it does not have to contend for any global data structure to update the buffer's usage information. It just has to bump the usage count in the buffer's header. The free list, and the contention for BufFreelistLock, only matter in the slow path where you're going to have to incur I/O anyway (or at least a visit to the kernel). That seemed plenty good enough in 2005. Our ambitions have now advanced further, so I'm on board with trying to reduce contention here too, but I think it would be a mistake to make the fast case any slower. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is indexonlyscan so darned slow?
On Mon, May 21, 2012 at 2:29 PM, Merlin Moncure wrote: > On Mon, May 21, 2012 at 4:17 PM, Jeff Janes wrote: >> >> For vaguely real life, take your example of pgbench -i -s200 -F 50, >> and I have 2Gig RAM, which seems to be the same as you do. >> >> With select only work load (pgbench -S -M prepared -T 30), I get >> >> tps = 193 >> >> But now enable index-only scans: >> >> psql -c "create index on pgbench_accounts(aid, abalance);" >> >> and it goes up to. >> >> tps = 10137 > > Right -- the main driver here is that your index fits neatly in ram > and the heap does not -- so you're effectively measuring the > difference between a buffered and non-buffered access. That's highly > contrived as you noted and unlikely to come up all *that* often in the > real world. I don't think this one is highly contrived. With the index being 10 fold smaller than the table, there is plenty of window for one to fit in RAM and the other one not to in a variety of real world situations. (Although I only get 10 fold window because of -F 50. I don't know why Josh had that big of a difference in his original, unless he also used a nondefault fill factor setting. But of course many real world tables will be wider than pgbench_accounts is.) The highly contrived example useful for dissecting the implementation would be to do: set enable_seqscan=off; set enable_indexonlyscan=off; select count(*) from pgbench_accounts where aid is not null; The key that I keep forgetting is the "where aid is not null". Without that it uses the full table scan, even with enable_seqscan off, rather than doing an ordinary index scan. > Generally though the real world wins (although the gains will be > generally less spectacular) are heavily i/o bound queries where the > indexed subset of data you want is nicely packed and the (non > clustered) heap records are all over the place. By skipping the semi > random heap lookups you can see enormous speedups. I figure 50-90% > improvement would be the norm there, but this is against queries that > are taking forever, being i/o bound. > > See here: > http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm > for a 'in the wild' gripe about not having index scans. But without scripts to recreate the data with the right selectivities and correlations, and to generate a long stream of appropriate query parameterizations so that they don't become cached, that is just a gripe and not an example. I tried to reproduce the problem as stated, and couldn't make IOS be useful because I couldn't make it be slow even without them. Presumably I'm doing something wrong, but how could I tell what? Have we heard back on whether IOS was tried and proved useful to the originator of that thread? If you want an example where neither index nor table fit in memory, then just bump up the scale to 2000---and get a machine with only 2G of memory if you don't already have one :) With the extra index in place: with enable_indexonlyscan=off tps = 155.1 with enable_indexonlyscan=on tps = 453.7 It seems like that should have only doubled, I'm not sure why it did more than double. Maybe the index became better cached when the table stopped competing with it for buffers. If you leave the select-only world and go back to doing updates, then that extra index is going to hurt you somewhat, but if the dominant bottleneck is rpm of your WAL drive, it might not be noticeable.
Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
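For anyone wanting to reproduce the bigger-than-memory case described above, a rough sketch of the whole sequence; the scale, fillfactor, and tps figures come from this message, and PGOPTIONS is just one convenient way to flip the planner setting per run:

    pgbench -i -s 2000 -F 50 bench
    psql -d bench -c "create index on pgbench_accounts(aid, abalance);"
    # index-only scans disabled: ~155 tps reported above
    PGOPTIONS='-c enable_indexonlyscan=off' pgbench -S -M prepared -T 30 bench
    # index-only scans enabled (the default): ~454 tps reported above
    pgbench -S -M prepared -T 30 bench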
Re: [HACKERS] Exclusion Constraints on Arrays?
On May 13, 2012, at 3:45 PM, Robert Haas wrote: > It seems like maybe we could work around this by remembering the > contents of the pending list throughout the scan. Every time we hit a > TID while scanning the main index, we check whether we already > returned it from the pending list; if so, we skip it, but if not, we > return it. Should this go onto the To-Do list, then? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On May22, 2012, at 18:03 , Thom Brown wrote: > On 22 May 2012 16:57, Florian Pflug wrote: >> On May22, 2012, at 16:09 , Tom Lane wrote: >>> Thom Brown writes: >>>> Conflicts would occur where localrolename matches an existing local >>>> role name within the same database, or a global role name, but not a >>>> local role name within another database. The problem with this, >>>> however, is that creating global roles would need conflict checks >>>> against local roles in every database, unless a manifest of all local >>>> roles were registered globally. >>> >>> Yeah. The same type of issue arises for the roles' OIDs. You'd really >>> want local and global roles to have nonconflicting OIDs, else it's >>> necessary to carry around an indication of which type each role is; >>> which would be more or less a show-stopper in terms of the number of >>> catalogs and internal APIs affected. But I don't currently see any >>> nice way to guarantee that if each database has a private table of >>> local roles. >> >> Maybe we could simply make all global role's OIDs even, and all local ones >> odd, or something like that. > > Wouldn't that instantly make all previous versions of database > clusters un-upgradable? Only if pg_upgrade needs to preserve the OIDs of roles. I kinda hoped it wouldn't, because role OIDs aren't usually stored in non-system tables. Hm… thinking about this further, it'd actually be sufficient for all newly allocated role OIDs to follow the odd/even rule, if we additionally check for conflicts with existing global role OIDs when allocating the OID of a new local role. Which is much, much easier than checking for conflicts when allocating a global OID, because for that you'd have to check against the local role OIDs within *all* databases, not just against one shared table. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 12:11 PM, Tom Lane wrote: >> With respect to the control logic for the background writer, one idea >> I had was to get rid of the idea that the background writer's job is >> to write in advance of the strategy point. Instead, every time the >> clock sweep passes over a dirty buffer that is otherwise evictable, we >> add it to a queue of things that the bgwriter should clean. Those >> buffers, once cleaned, go on the free list. Maybe some variant of >> that could work with your idea. > > Both ends of that imply centralized data structures that could become > contention bottlenecks, so it doesn't sound tremendously appealing to > me. Maybe it can work as long as nobody has to lock the lists for long, > but I'd rather think of approaches with no central point of contention. If we're going to throw our current algorithm over wholesale, I'd rather use some approach that has been demonstrated to work well in other systems. Buffer eviction is a problem that's been around since the 1970s, and our algorithm is just about that old. I realize that there are (legitimate) concerns about what might be patented, but hopefully that doesn't mean we're not allowed to consult the literature in any way. If that were the case, we wouldn't have SSI. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
Thom Brown writes: > On 22 May 2012 16:57, Florian Pflug wrote: >> Maybe we could simply make all global role's OIDs even, and all local ones >> odd, or something like that. > Wouldn't that instantly make all previous versions of database > clusters un-upgradable? IIRC, pg_upgrade doesn't need to force role OIDs to be the same in the new cluster, so we could get away with this trick for the specific case of roles. It wouldn't work for introducing local/global versions of some other types of objects though. Another objection is that it wouldn't scale up nicely to multiple levels of catalog hierarchy. But maybe local/global is enough. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby PSQL 9.1 Windows 2008 Servers
Dear Robert, Thank you very much for the reply. You mean that the primary which is going to switch its role to standby might not have sent all the WAL records to the standby, and if it is switched to standby it will have more WAL records than the standby which now serves as primary. Is that it? It is actually the standby server which has to be restored from the archive when it is switching to primary, right? Not the primary which is switching to standby? Regards, Reddy. -- View this message in context: http://postgresql.1045698.n5.nabble.com/hot-standby-PSQL-9-1-Windows-2008-Servers-tp5708637p5709495.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On May22, 2012, at 18:00 , Susanne Ebrecht wrote: > Am 22.05.2012 17:42, schrieb Tom Lane: >> Encoding yes, but since 9.1 we have pretty fine-grained control of >> collation. So I think this argument is a lot weaker than it used >> to be. It would only really apply if you have one of the corner >> cases where utf8 doesn't work for you. > > Yeah it got better - but it isn't perfect yet. Still, the whole reason that the encoding is a per-database property is that we don't tag each string with its encoding. To support cross-database queries, we'd have to do that, and could then just as well make the encoding a per-column property, or at least so I think. > Maybe I am blind or 9.1 documentation has a bug - but according to the > documentation you can't change default collation per schema or per table. > You can set collation per column - but do you really want to set collation for > every single column of every single supported language in your 200+ tables > web tool? > > That is a huge effort and a huge maintenance effort. You could always write a pl/pgsql function which iterates over all columns of type text or varchar within a schema and sets the desired collation, but > I would like to get default collation per schema / table in 9.2 or 9.3 but > that is my personal wish, yeah, that'd definitely be nice. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
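For illustration, a minimal sketch of such a function; the function name and the example collation are made up, and note that each ALTER TABLE below rewrites the table, so this is only sensible as a one-off migration step:

    CREATE OR REPLACE FUNCTION set_schema_collation(p_schema text, p_collation text)
    RETURNS void LANGUAGE plpgsql AS $$
    DECLARE
        col record;
    BEGIN
        FOR col IN
            SELECT c.relname, a.attname,
                   format_type(a.atttypid, a.atttypmod) AS coltype
            FROM pg_class c
            JOIN pg_namespace n ON n.oid = c.relnamespace
            JOIN pg_attribute a ON a.attrelid = c.oid
            WHERE n.nspname = p_schema
              AND c.relkind = 'r'
              AND a.attnum > 0
              AND NOT a.attisdropped
              AND a.atttypid IN ('text'::regtype, 'varchar'::regtype)
        LOOP
            EXECUTE format('ALTER TABLE %I.%I ALTER COLUMN %I TYPE %s COLLATE %I',
                           p_schema, col.relname, col.attname,
                           col.coltype, p_collation);
        END LOOP;
    END;
    $$;

    -- e.g. SELECT set_schema_collation('russian', 'ru_RU.utf8');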
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > On Tue, May 22, 2012 at 10:25 AM, Tom Lane wrote: >> My own thoughts about this had pointed in the direction of getting rid >> of the central freelist entirely, instead letting each backend run its >> own independent clock sweep as needed. > Hmm, that's certainly an interesting idea. I fear that if the clock > sweeps from the different backends ended up too closely synchronized, > you would end up evicting whatever was in the way, be it hot or cold. > It might almost be better to have individual backends choose buffers > to evict at random; if the chosen buffer isn't evictable, we decrement > its usage count and pick another one, also at random. Hmm ... yeah, in principle that could be better. It's still a clock sweep but following a hard-to-predict ordering of the buffers. Being hard to predict doesn't necessarily guarantee no bad behavior though. Maybe we could do something a bit cheaper than random() and more amenable to analysis, that would still ensure that backends couldn't end up with closely sync'd scans. I'm imagining advancing not 1 buffer each time, but K buffers where each backend will use a different value of K. Perhaps backend with backendID N could use the N'th prime number, say. You'd want something relatively prime to shared_buffers to guarantee that the backend can visit all the buffers, and just making it prime would do fine. > With respect to the control logic for the background writer, one idea > I had was to get rid of the idea that the background writer's job is > to write in advance of the strategy point. Instead, every time the > clock sweep passes over a dirty buffer that is otherwise evictable, we > add it to a queue of things that the bgwriter should clean. Those > buffers, once cleaned, go on the free list. Maybe some variant of > that could work with your idea. Both ends of that imply centralized data structures that could become contention bottlenecks, so it doesn't sound tremendously appealing to me. Maybe it can work as long as nobody has to lock the lists for long, but I'd rather think of approaches with no central point of contention. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 12:00 PM, Susanne Ebrecht wrote: > Usually you want to set the collation once per language schema. E.g. schema > russian gets Russian collation and schema British gets British collation and > so on. > > CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a level and > do it by creating a database. > > I would like to get default collation per schema / table in 9.2 or 9.3 but > that is my personal > wish, Interesting idea. Sort of like ALTER DEFAULT PRIVILEGES. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On 22 May 2012 16:57, Florian Pflug wrote: > On May22, 2012, at 16:09 , Tom Lane wrote: >> Thom Brown writes: >>> Conflicts would occur where localrolename matches an existing local >>> role name within the same database, or a global role name, but not a >>> local role name within another database. The problem with this, >>> however, is that creating global roles would need conflict checks >>> against local roles in every database, unless a manifest of all local >>> roles were registered globally. >> >> Yeah. The same type of issue arises for the roles' OIDs. You'd really >> want local and global roles to have nonconflicting OIDs, else it's >> necessary to carry around an indication of which type each role is; >> which would be more or less a show-stopper in terms of the number of >> catalogs and internal APIs affected. But I don't currently see any >> nice way to guarantee that if each database has a private table of >> local roles. > > Maybe we could simply make all global role's OIDs even, and all local ones > odd, or something like that. Wouldn't that instantly make all previous versions of database clusters un-upgradable? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Am 22.05.2012 17:42, schrieb Tom Lane: Encoding yes, but since 9.1 we have pretty fine-grained control of collation. So I think this argument is a lot weaker than it used to be. It would only really apply if you have one of the corner cases where utf8 doesn't work for you. Yeah it got better - but it isn't perfect yet. Maybe I am blind or 9.1 documentation has a bug - but according to the documentation you can't change default collation per schema or per table. You can set collation per column - but do you really want to set collation for every single column of every single supported language in your 200+ tables web tool? That is a huge effort and a huge maintenance effort. Usually you want to set the collation once per language schema. E.g. schema russian gets Russian collation and schema British gets British collation and so on. CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a level and do it by creating a database. I would like to get default collation per schema / table in 9.2 or 9.3 but that is my personal wish, Susanne -- Dipl. Inf. Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 10:25 AM, Tom Lane wrote: >> The idea would be to have a background process (like bgwriter) >> maintain the global LRU state and push candidate buffers onto the >> freelist. > > Amit was trying to convince me of the same idea at PGCon, but I don't > buy it. bgwriter doesn't scan the buffer array nearly fast enough to > provide useful adjustment of the usage counts under load. And besides > if the decrements are decoupled from the allocation requests it's no > longer obvious that the algorithm is even an approximation of LRU. Well, bgwriter is *supposed* to anticipate which buffers are about to be allocated and clean any of those that are dirty. Having it decrement the usage counts and stuff the resulting list of buffers into a linked list seems like a pretty reasonable extension of that, assuming that it works in the first place. If it doesn't, then we need a rethink. > But the larger issue here is that if that processing is a bottleneck > (which I agree it is), how does it help to force a single process to > be responsible for it? Any real improvement in scalability here will > need to decentralize the operation more, not less. Sure. I think we could have the freelist and the clock sweep protected by different locks. The background writer would lock out other people running the clock sweep, but the freelist could be protected by a spinlock which no one would ever need to take for more than a few cycles. Right there, you should get a significant scalability improvement, since the critical section would be so much shorter than it is now. If that's not enough, you could have several freelists protected by different spinlocks; the bgwriter would put 1/Nth of the reusable buffers on each freelist, and backends would pick a freelist at random to pull buffers off of. > My own thoughts about this had pointed in the direction of getting rid > of the central freelist entirely, instead letting each backend run its > own independent clock sweep as needed. The main problem with that is > that if there's no longer any globally-visible clock sweep state, it's > pretty hard to figure out what the control logic for the bgwriter should > look like. Maybe it would be all right to have global variables that > are just statistics counters for allocations and buffers swept over, > which backends would need to spinlock for just long enough to increment > the counters at the end of each buffer allocation. Hmm, that's certainly an interesting idea. I fear that if the clock sweeps from the different backends ended up too closely synchronized, you would end up evicting whatever was in the way, be it hot or cold. It might almost be better to have individual backends choose buffers to evict at random; if the chosen buffer isn't evictable, we decrement its usage count and pick another one, also at random. With respect to the control logic for the background writer, one idea I had was to get rid of the idea that the background writer's job is to write in advance of the strategy point. Instead, every time the clock sweep passes over a dirty buffer that is otherwise evictable, we add it to a queue of things that the bgwriter should clean. Those buffers, once cleaned, go on the free list. Maybe some variant of that could work with your idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On May22, 2012, at 16:09 , Tom Lane wrote: > Thom Brown writes: >> Conflicts would occur where localrolename matches an existing local >> role name within the same database, or a global role name, but not a >> local role name within another database. The problem with this, >> however, is that creating global roles would need conflict checks >> against local roles in every database, unless a manifest of all local >> roles were registered globally. > > Yeah. The same type of issue arises for the roles' OIDs. You'd really > want local and global roles to have nonconflicting OIDs, else it's > necessary to carry around an indication of which type each role is; > which would be more or less a show-stopper in terms of the number of > catalogs and internal APIs affected. But I don't currently see any > nice way to guarantee that if each database has a private table of > local roles. Maybe we could simply make all global role's OIDs even, and all local ones odd, or something like that. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Susanne Ebrecht writes: > The use case in my mind for accessing more databases is when you want to > access stuff different languages. > You only can set encoding / LC_Collate per database not per schema. > So for different languages you might need different databases to do > correct sorting / indexing. Encoding yes, but since 9.1 we have pretty fine-grained control of collation. So I think this argument is a lot weaker than it used to be. It would only really apply if you have one of the corner cases where utf8 doesn't work for you. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Am 22.05.2012 15:27, schrieb Albe Laurenz: If you need different applications to routinely access each other's tables, why not assign them to different schemas in one database? The use case in my mind for accessing more databases is when you want to access stuff in different languages. You can only set encoding / LC_Collate per database, not per schema. So for different languages you might need different databases to do correct sorting / indexing. Susanne -- Dipl. Inf. Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] weird error message in sepgsql
On Sun, May 20, 2012 at 11:28 AM, Tom Lane wrote: > Peter Eisentraut writes: >> I found this in contrib/sepgsql/expected/label.out: >> SECURITY LABEL ON COLUMN t2 >> IS 'system_u:object_r:sepgsql_ro_table_t:s0'; -- be failed >> ERROR: improper relation name (too many dotted names): > >> Contrast with: > >> SECURITY LABEL ON COLUMN t2.b >> IS 'system_u:object_r:sepgsql_ro_table_t:s0'; -- ok > >> I guess what's happening is that it's calling makeRangeVarFromNameList() >> with a list of length zero. > >> We should either fix the SECURITY LABEL command to catch that case >> beforehand, or fix makeRangeVarFromNameList() to give a proper error >> message, or both. > > I think the appropriate error message is probably along the lines of > "column name must be qualified", and it's hard to justify having > makeRangeVarFromNameList emit such a thing. So this is the fault > of the calling code. Fixed and back-patched to 9.1. Thanks for the report. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > On Tue, May 22, 2012 at 10:01 AM, Tom Lane wrote: >> Well, keep in mind that that action is not merely there to obtain a >> victim buffer; it is also maintaining the global LRU state (by >> decrementing the usage counts of buffers it passes over). I don't think >> you can change it to simply look only at a predetermined freelist >> without seriously compromising the overall quality of our buffer >> replacement decisions. > The idea would be to have a background process (like bgwriter) > maintain the global LRU state and push candidate buffers onto the > freelist. Amit was trying to convince me of the same idea at PGCon, but I don't buy it. bgwriter doesn't scan the buffer array nearly fast enough to provide useful adjustment of the usage counts under load. And besides if the decrements are decoupled from the allocation requests it's no longer obvious that the algorithm is even an approximation of LRU. But the larger issue here is that if that processing is a bottleneck (which I agree it is), how does it help to force a single process to be responsible for it? Any real improvement in scalability here will need to decentralize the operation more, not less. My own thoughts about this had pointed in the direction of getting rid of the central freelist entirely, instead letting each backend run its own independent clock sweep as needed. The main problem with that is that if there's no longer any globally-visible clock sweep state, it's pretty hard to figure out what the control logic for the bgwriter should look like. Maybe it would be all right to have global variables that are just statistics counters for allocations and buffers swept over, which backends would need to spinlock for just long enough to increment the counters at the end of each buffer allocation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Mon, May 21, 2012 at 9:54 PM, Noah Misch wrote: > On Wed, May 09, 2012 at 11:11:02PM -0400, Bruce Momjian wrote: >> I have completed my draft of the 9.2 release notes, and committed it to >> git. > > Concerning "Have psql \copy use libpq's SendQuery()", SendQuery() is a > psql-internal interface, not a libpq interface. > > The array statistics patch added new columns to the pg_stats view, and it > moved existing tsvector most-common-element statistics to those new columns. > Let's mention that as a (minor) incompatibility. > > Proposed changes attached. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
* Thom Brown (t...@linux.com) wrote: > Conflicts would occur where localrolename matches an existing local > role name within the same database, or a global role name, but not a > local role name within another database. The problem with this, > however, is that creating global roles would need conflict checks > against local roles in every database, unless a manifest of all local > roles were registered globally. Hmm, right, that's a bit of a sticky point. wrt your suggestion - it works great if we don't allow duplicates, but then people have to accept their role name is getting changed. That said, perhaps that's not that big of a deal, since you could rename it afterwards. The issue with the constraints on other databases might actually be cause enough to allow duplicates, just to avoid that issue. We could mirror the per-database roles into a shared space that isn't externally visible, but at that point, maybe we should try to get RLS for the catalogs instead, or just modify the views to only show roles which can connect to this database. That's not going to make them completely transparent, but it might be enough for some use cases. Thanks, Stephen
Re: [HACKERS] Per-Database Roles
On Tue, May 22, 2012 at 9:37 AM, Thom Brown wrote: > On 22 May 2012 14:04, Stephen Frost wrote: >> What would the semantics of that look like though? Which is "preferred" >> when you do a 'grant select' or 'grant role'? Or do we just disallow >> overlaps between per-DB roles and global roles? If we don't allow >> duplicates, I suspect a lot of the other questions suddenly become a lot >> easier to deal with, but would that be too much of a restriction? How >> would you handle migrating an existing global role to a per-database >> role? > > Perhaps: > > CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ] > option [ ... ] ] > > Then: > > CREATE LOCAL ROLE localrolename LIKE globalrolename; > > REASSIGN OWNED BY globalrolename TO localrolename; > > Conflicts would occur where localrolename matches an existing local > role name within the same database, or a global role name, but not a > local role name within another database. The problem with this, > however, is that creating global roles would need conflict checks > against local roles in every database, unless a manifest of all local > roles were registered globally. There are race conditions to worry about, too. In most cases, we rely on the btree index machinery as a final backstop against duplicate catalog entries. But that doesn't work unless everything's in one catalog, nor for anything more complicated than "this set of columns taken together should be unique over every role". Even if we were OK with incurring the ugliness of storing per-database roles in a shared catalog, the uniqueness constraint you'd want is something like "no two roles can share the same name unless they have unequal database OIDs neither of which is zero", which I don't believe we can enforce via the btree machinery, at least not without an expression index that won't work in a system catalog anyway. In retrospect, I think the idea of shared catalogs was probably a bad idea. I think we should have made roles and tablespaces database objects rather than shared objects, and come up with some ad-hoc method of representing the set of available databases. But that decision seems to have been made sometime pre-1996, so the thought of changing it now is pretty painful, but I can dream... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
Thom Brown writes: > Conflicts would occur where localrolename matches an existing local > role name within the same database, or a global role name, but not a > local role name within another database. The problem with this, > however, is that creating global roles would need conflict checks > against local roles in every database, unless a manifest of all local > roles were registered globally. Yeah. The same type of issue arises for the roles' OIDs. You'd really want local and global roles to have nonconflicting OIDs, else it's necessary to carry around an indication of which type each role is; which would be more or less a show-stopper in terms of the number of catalogs and internal APIs affected. But I don't currently see any nice way to guarantee that if each database has a private table of local roles. You could possibly make it work if all roles, local and global, are stored in a single shared catalog. But that seems pretty ugly. BTW, I wonder whether this type of problem isn't also pretty fatal for the sorts of hierarchical catalog structures we were speculating about at PGCon. When we were talking about that I was sort of assuming that the more-closely-nested levels could just hide conflicting objects at outer levels, but on reflection that seems a bit scary. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 10:01 AM, Tom Lane wrote: > Robert Haas writes: >> Mind you, I think this whole area of the code needs some reengineering >> for better performance, but I'm not sure this is the right place to >> start. What I think is really bad is that we're forcing every >> BufferAlloc() to iterate over buffers checking whether each one is >> evictable. > > Well, keep in mind that that action is not merely there to obtain a > victim buffer; it is also maintaining the global LRU state (by > decrementing the usage counts of buffers it passes over). I don't think > you can change it to simply look only at a predetermined freelist > without seriously compromising the overall quality of our buffer > replacement decisions. The idea would be to have a background process (like bgwriter) maintain the global LRU state and push candidate buffers onto the freelist. Then foreground processes can just pop them off the list and recheck that they haven't been pinned meanwhile. As long as we don't let the background sweep get too far ahead of actual allocation needs, I don't think this would change the quality of buffer allocation much at all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] incorrect handling of the timeout in pg_receivexlog
On Mon, May 14, 2012 at 2:24 PM, Fujii Masao wrote: > On Fri, May 11, 2012 at 11:43 PM, Magnus Hagander wrote: >> Should we go down the easy way and just reject connections when the flag is >> mismatching between the client and the server (trivial to do - see the >> attached patch)? > > + char *tmpparam; > > You forgot to add "const" before "char", which causes a compile-time warning. I went ahead and committed this, with this fix and a slight change to the message text. Hope that's OK with everyone... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > Mind you, I think this whole area of the code needs some reengineering > for better performance, but I'm not sure this is the right place to > start. What I think is really bad is that we're forcing every > BufferAlloc() to iterate over buffers checking whether each one is > evictable. Well, keep in mind that that action is not merely there to obtain a victim buffer; it is also maintaining the global LRU state (by decrementing the usage counts of buffers it passes over). I don't think you can change it to simply look only at a predetermined freelist without seriously compromising the overall quality of our buffer replacement decisions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "could not open relation with OID" errors after promoting the standby to master
On Tue, May 15, 2012 at 10:37 PM, Joachim Wieland wrote: > If it matters, I have not promoted the master with a trigger file but > restarted it after deleting recovery.conf. Hmm. I think that if you do it this way, the minimum recovery point won't be respected, which could leave you with a corrupted database. Now, if all the WAL files that you need are present in pg_xlog, then they ought to get replayed anyway, but I think that if you are using restore_command (as opposed to streaming replication) we restore WAL segments under a different file name, which might cause this problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] enhanced error fields
On Wed, May 9, 2012 at 9:33 AM, Pavel Stehule wrote: > here is patch with enhancing ErrorData structure. Now constraints > errors and RI uses these fields Please add this to https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 8, 2012 at 9:37 PM, Amit Kapila wrote: > I have checked the code and logic according to which usage counter is > increased when the buffer is pinned. Fixed, thanks for the report. > Another Doubt : Why in function BufferAlloc, it needs to hold the > BufFreelistLock till it pin the buffer which increases its reference count. Well, I think the problem is that, if we didn't do that, then, in theory, the strategy point could wrap all the way around shared_buffers and someone else could pin the buffer, and then we'd be hosed. Mind you, I think this whole area of the code needs some reengineering for better performance, but I'm not sure this is the right place to start. What I think is really bad is that we're forcing every BufferAlloc() to iterate over buffers checking whether each one is evictable. I think we ought to put only those buffers that we think are likely to be evictable on the freelist, and then the actual buffer eviction code would only need to recheck that nothing had changed, instead of needing to scan over a potentially quite large number of buffers that never had any chance of being selected in the first place. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On 22 May 2012 14:04, Stephen Frost wrote: > What would the semantics of that look like though? Which is "preferred" > when you do a 'grant select' or 'grant role'? Or do we just disallow > overlaps between per-DB roles and global roles? If we don't allow > duplicates, I suspect a lot of the other questions suddenly become a lot > easier to deal with, but would that be too much of a restriction? How > would you handle migrating an existing global role to a per-database > role? Perhaps: CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ] option [ ... ] ] Then: CREATE LOCAL ROLE localrolename LIKE globalrolename; REASSIGN OWNED BY globalrolename TO localrolename; Conflicts would occur where localrolename matches an existing local role name within the same database, or a global role name, but not a local role name within another database. The problem with this, however, is that creating global roles would need conflict checks against local roles in every database, unless a manifest of all local roles were registered globally. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Simon Riggs wrote: > On 21 May 2012 20:40, Stephen Frost wrote: > >>> This is important. I like the idea of breaking down the barriers >>> between databases to allow it to be an option for one backend to >>> access tables in multiple databases. > So collecting a few requirements from various places: [...] > * Allow users to access tables in >1 database easily, with appropriate rights. > The main objectives are to make a Database a more easily used > administrative grouping. At present, people who use multiple Databases > face many problems - they aren't as separate as you'd like, but > neither can they be ignored when required. > > The idea of "one main database per session" is fine, but wiring it so > closely into the backend has a few disadvantages, many of them weird > internal things. > > Are there arguments against those requirements before we spend time on > design/thinking? From my perspective it is a great advantage that a user connected to one database cannot access objects from a different database without additional software, no matter what permissions he or she has (short of superuser, who can do anything). This enables us to have many different databases in one cluster without having to worry that they can affect each other. If you need different applications to routinely access each other's tables, why not assign them to different schemas in one database? For those cases where you absolutely need access to a different database, you can use dblink or a foreign data wrapper (hopefully in 9.3). So -1 on that particular suggestion. Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
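For reference, a minimal sketch of the dblink route mentioned above (the database and table names are made up, and it assumes the dblink extension is installed):

-- Query a table in a sibling database "salesdb" from the current session.
CREATE EXTENSION IF NOT EXISTS dblink;

SELECT *
FROM dblink('dbname=salesdb',
            'SELECT id, total FROM orders')
       AS t(id integer, total numeric);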
Re: [HACKERS] ERROR: catalog is missing 2 attribute(s) for relid 16584
On Mon, May 14, 2012 at 1:01 AM, Prakash Itnal wrote: > Recently we faced an issue with postgres server where it is throwing error: > > ERROR: catalog is missing 2 attribute(s) for relid 16584 > CONTEXT: automatic analyze of table "DBRNW.public.act_wsta" > > I checked in the database and found that this table is not present but the > entry for the same is present in 'pg_tables'. This error is occurring when > auto analyze is executed. > > I searched for the issue in postgres blogs and found that there is no way to > recover from this situation except to restore from the last backup!! I wanted to > confirm: is there any way to recover at run time? Because if we restore from > the last backup then we might lose some data that got updated after this error > occurred. > > Any help would be appreciated. It sounds like you are missing a couple of pg_attribute entries. You might be able to figure out what's missing and put them back, but of course whatever got your system into this state might have also caused other corruption... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
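As a starting point for diagnosis, a query along these lines (run as superuser) compares the column count pg_class claims against the pg_attribute rows that actually exist; it only locates the damage, it does not repair it:

-- Sketch: relations whose recorded user-column count does not match
-- the pg_attribute rows present for them.
SELECT c.oid, c.relname, c.relnatts AS expected, count(a.attnum) AS found
FROM pg_class c
     LEFT JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum > 0
GROUP BY c.oid, c.relname, c.relnatts
HAVING c.relnatts <> count(a.attnum);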
Re: [HACKERS] hot standby PSQL 9.1 Windows 2008 Servers
On Mon, May 14, 2012 at 8:18 AM, chinnaobi wrote: > I do a base backup only the first time, when the standby is set up to be > replicated. Whenever the primary goes down, the standby becomes primary, and > the primary becomes standby when it comes back up. When the primary becomes standby, > I restore data from the WAL archive and start the postgres service with streaming > replication to connect to the primary. > > This setup is working. I don't think this is safe. The primary might have WAL that never made it to the standby, in which case the two machines will be out of sync with each other and all sorts of bad stuff could happen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan wrote: > That seems to be leaving aside the fact that we don't currently have any > notion of how to allow FDWs to write the foreign tables. > > What is more, isn't the postgres FDW about talking to any postgres source? > If so, does it have special smarts for when we are talking to ourselves? And > if it doesn't then it seems unlikely to be an acceptable substitute for > allowing talking direct to a sibling database. > > I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would > adequately meet the case if we wanted to go that way. Well, I don't think anyone is claiming that FDWs as they exist today solve all of the problems in this area. But I think that improving FDWs is a more promising line of attack than trying to make backends talk to multiple databases. Doing the latter will require massive surgery on the relcache, the catcache, most of the secondary catalog caches, the ProcArray, and every portion of the backend that thinks an OID uniquely identifies an SQL object. Basically, they'd all need database OID as an additional key field, which is undesirable for performance reasons even if there were no issue of code churn. So I'd rather see us put the effort into pgsql_fdw, which, as Florian says, will also let us talk to a completely separate server. If you've got multiple databases in the same cluster and really need to be doing queries across all of them, that's what schemas are supposedly for. Now I know that people feel that doesn't work as well as it needs to, but there again I think it would be easier to fix schemas than to make cross-database queries work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
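As a rough illustration of the FDW route: the SQL/MED commands themselves (CREATE SERVER, CREATE USER MAPPING, CREATE FOREIGN TABLE) are already in core, while the wrapper name and its options below are assumptions, since pgsql_fdw itself is not yet there.

CREATE EXTENSION pgsql_fdw;          -- assumed extension name

CREATE SERVER otherdb
    FOREIGN DATA WRAPPER pgsql_fdw
    OPTIONS (host 'localhost', dbname 'salesdb');   -- assumed option names

CREATE USER MAPPING FOR CURRENT_USER
    SERVER otherdb
    OPTIONS (user 'app', password 'secret');

CREATE FOREIGN TABLE orders (id integer, total numeric) SERVER otherdb;

SELECT count(*) FROM orders;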
[HACKERS] Per-Database Roles
* Simon Riggs (si...@2ndquadrant.com) wrote: > * Ability to have a Role that can only access one Database Alright, I'd like to think about this one specifically and solicit feedback on the idea that we keep the existing shared role tables but add on additional tables for per-database roles. In the past, I feel like we've been focused on the idea of moving all roles to be per-database instead of per-cluster, which certainly has a lot of problems associated with it, but in the end, I think people would be really happy with some shared roles and some per-DB roles. What would the semantics of that look like though? Which is "preferred" when you do a 'grant select' or 'grant role'? Or do we just disallow overlaps between per-DB roles and global roles? If we don't allow duplicates, I suspect a lot of the other questions suddenly become a lot easier to deal with, but would that be too much of a restriction? How would you handle migrating an existing global role to a per-database role? Thanks, Stephen
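To make the ambiguity concrete: suppose duplicates were allowed and both a global role and a per-database role named "alice" existed in the current database (a purely hypothetical situation today); it is then unclear which one statements like these should target.

GRANT SELECT ON accounts TO alice;   -- the local alice or the global one?
GRANT alice TO bob;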
Re: [HACKERS] heap metapages
On Tue, May 22, 2012 at 4:52 AM, Simon Riggs wrote: > Based upon all you've said, I'd suggest that we make a new kind of > fork, in a separate file for this, .meta. But we also optimise the VM > and FSM in the way you suggest so that we can replace .vm and .fsm > with just .meta in most cases. Big tables would get a .vm and .fsm > appearing when they get big enough, but that won't challenge the inode > limits. When .vm and .fsm do appear, we remove that info from the > metapage - that means we keep all code as it is currently, except for > an optimisation of .vm and .fsm when those are small enough to do so. Well, let's see. That would mean that a small heap relation has 2 forks instead of 3, and a large relation has 4 forks instead of 3. In my proposal, a small relation has 1 fork instead of 3, and a large relation still has 3 forks. So I like mine better. Also, I think that we need a good chunk of the metadata here for both tables and indexes. For example, if we use the metapage to store information about whether a relation is logged, unlogged, being converted from logged to unlogged, or being converted from unlogged to logged, we need that information both for tables and for indexes. Now, there's no absolute reason why those cases have to be handled symmetrically, but I think things will be a lot simpler if they are. If we settle on the rule that block 0 of every relation contains a certain chunk of metadata at a certain byte offset, then the code to retrieve that data when needed is pretty darn simple. If tables put it in a separate fork and indexes put it in the main fork inside the metablock somewhere, then things are not so simple. And I sure don't want to add a separate fork for every index just to hold the metadata: that would be a huge hit in terms of total inode consumption. > We can watermark data files using special space on block zero using > some code to sneak that in when the page is next written, but that is > regarded as optional, rather than an essential aspect of an > upgrade/normal operation. > > Having pg_upgrade touch data files is both dangerous and difficult to > back out in case of mistake, so I am wary of putting the metapage at > block 0. Doing it the way I suggest means the .meta files would be > wholly new and can be deleted as a back-out. We can also clean away > any unnecessary .vm/.fsm files as a later step. It seems pretty clear to me that making pg_upgrade responsible for emptying block zero is a non-starter. But I don't think that's a reason to throw out the design; I think it's a problem we can work around. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 05/22/2012 07:56 AM, Robert Haas wrote: On Tue, May 22, 2012 at 7:35 AM, Florian Pflug wrote: * Allow users to access tables in >1 database easily, with appropriate rights. That one I'm very sceptical about. In the long run, I think we want better separation of databases, not less, and this requirement carries a huge risk of standing in the way of that. Also, I think that once we integrate the postgres FDW into core (that's the plan in the long run, right?), we're going to get a good approximation of that essentially for free. +1. That seems to be leaving aside the fact that we don't currently have any notion of how to allow FDWs to write the foreign tables. What is more, isn't the postgres FDW about talking to any postgres source? If so, does it have special smarts for when we are talking to ourselves? And if it doesn't, then it seems unlikely to be an acceptable substitute for talking directly to a sibling database. I'm not yet sold on Simon's plan at all, but I'm skeptical that an FDW would adequately meet the case if we wanted to go that way. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On Tue, May 22, 2012 at 1:50 AM, Tom Lane wrote: > Currently, the planner keeps paths that appear to win on the grounds of > either cheapest startup cost or cheapest total cost. It suddenly struck > me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor > fast-start preference, etc) we could know a-priori that cheapest startup > cost is not going to be interesting, and hence immediately discard any > path that doesn't win on total cost. > > This would require some additional logic to detect whether the case > applies, as well as extra complexity in add_path. So it's possible > that it wouldn't be worthwhile overall. Still, it seems like it might > be a useful idea to investigate. > > Thoughts? Yeah, I think we should investigate that. Presumably you could easily have a situation where one part of the tree is under a LIMIT or EXISTS and therefore needs to preserve fast-start plans but the rest of the (potentially large) tree isn't, so we need something fairly fine-grained, I think. Maybe we could add a flag to each RelOptInfo indicating whether fast-start plans should be kept, or something like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
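A hedged example of the mixed case (table names invented, plan shapes will vary): inside the EXISTS the executor stops after the first match, so a cheap-startup path is valuable there, while the outer aggregate scans everything and only total cost matters.

EXPLAIN
SELECT count(*)
FROM orders o
WHERE EXISTS (SELECT 1 FROM order_lines l WHERE l.order_id = o.id);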
Re: [HACKERS] Changing the concept of a DATABASE
On May 22, 2012, at 13:47 , Simon Riggs wrote: > On 22 May 2012 12:35, Florian Pflug wrote: >>> * Allow users to access tables in >1 database easily, with appropriate >>> rights. >> >> That one I'm very sceptical about. In the long run, I think we want better >> separation of databases, not less, and this requirement carries a huge risk >> of standing in the way of that. Also, I think that once we integrate the >> postgres >> FDW into core (that's the plan in the long run, right?), we're going to get >> a good approximation of that essentially for free. > > It's a poor approximation of it, free or not. > > If it actually worked well, I'd be happy. It doesn't. No proper > transaction support, no session pool, poor planning etc etc. At best > it's a band-aid, not something sufficiently good to recommend for > general production use. These all sound fixable, though. > FDWs are very good but aren't good enough for intra-database usage. OTOH, cross-database queries would by design be limited to databases within one cluster, whereas an FDW-based solution would not. I don't really see the difference between telling people "put all your tables into one database if you need to access them from within one session" and "put all your databases on one server if you need to access them from within one session". Plus, the more tightly different databases in the same cluster are coupled, the more people will set up one cluster per database for performance reasons. Then, when they discover they need inter-database queries after all, we'll again have to tell them "well, then don't set things up the way you have". If we want to make it easier for people to migrate from multiple databases to a single database with multiple schemas, maybe we should look into allowing nested schemas? AFAIK, the main reason not to do that is the ambiguity in the meaning of identifiers, which cross-database queries would have to deal with also. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22/05/12 13:47, Simon Riggs wrote: On 22 May 2012 12:35, Florian Pflug wrote: * Allow users to access tables in >1 database easily, with appropriate rights. That one I'm very sceptical about. In the long run, I think we want better separation of databases, not less, and this requirement carries a huge risk of standing in the way of that. Also, I think that once we integrate the postgres FDW into core (that's the plan in the long run, right?), we're going to get a good approximation of that essentially for free. It's a poor approximation of it, free or not. If it actually worked well, I'd be happy. It doesn't. No proper transaction support, no session pool, poor planning etc etc. At best it's a band-aid, not something sufficiently good to recommend for general production use. - Transaction support: should be fixed, I guess. - Session pool: is this really needed? I would handle it externally -- for example, by specifying a connection string to a pgPool/pgBouncer as opposed to directly to the db server. - Planning: add a tunable specifying a higher cost (with an exception for cluster-local dblinks, which would have a lower cost), and the rest should be straightforward. Of course, planning wouldn't be as accurate (we can't access the other db's statistics in order to potentially rewrite conditions), but I don't think that would be a proper approach (separation of concerns, again). FDWs are very good but aren't good enough for intra-database usage. The optimization I just proposed (plus the required fixes to FDW) might very well solve this, while providing useful enhancements for all users, whether they are accessing cluster-local databases or not (or even truly foreign datasources). Regards, Jose Luis Tallon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22/05/12 13:24, Simon Riggs wrote: On 22 May 2012 12:05, José Luis Tallón wrote: IMVHO: s/database/schema/g does resolve many of the problems that you were referring to... and 'dblink' should solve the rest, right? Please, feel free to point out what I am (most probably) not considering -- not experienced enough yet :) The choice of schema/database is an important one. If you get it wrong, you are in major difficulty. In many cases schemas would be a better choice, but not in all cases. So I'm interested in solving the problems for people who have multiple databases on the same server. Ok. Understood. Thank you for the clarification. dblink is the only solution, but it's a very poor way to do this when we have 2 databases on the same server. My thinking is that reaching out to multiple databases is actually mostly easy, except in a few places where dbid is hardwired into the backend. The only drawback I see is that it might weaken the separation. Even though arguably a kludge, dblink could have a "shortcut" added, whereby connections to another database within the same cluster would be serviced directly within the backend, as opposed to opening a new db connection. This is effectively a fastpath within dblink, which optimizes a relatively common case while at the same time not losing generality. On the other hand, the separation of databases allows what otherwise would only be possible by using multiple instances of the database server (à la Oracle, AFAIK) -- save for resource management, but that is another question altogether. Separation of databases is fine. I have no intention to change that, as long as the user wishes that. Perfect. Thanks, Jose Luis Tallon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Problem with error response message
Hi, The ErrorResponse message from the backend includes field type codes such as: S Severity: the field contents are ERROR, FATAL, or PANIC (in an error message), or WARNING, NOTICE, DEBUG, INFO, or LOG (in a notice message), or a localized translation of one of these. Always present. C Code: the SQLSTATE code for the error (see Appendix A). Not localizable. Always present. If the severity is ERROR, the current transaction is aborted but the session continues. If it is FATAL or worse, the backend disconnects the current session, or the postmaster even restarts all backends (which results in session disconnection anyway). Cluster management tools such as pgpool-II have to handle the message carefully because there's a big difference between just aborting a transaction and disconnecting a session. In the former case, pgpool just forwards the error message to the frontend, but in the latter, it should disconnect the session. For this purpose, pgpool-II examines the SQLSTATE code. Unfortunately, some error codes such as 40001 (serialization_failure) come with different severities, ERROR or FATAL. Here are some examples: ./tcop/postgres.c- ereport(FATAL, ./tcop/postgres.c: (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), ./tcop/postgres.c-errmsg("terminating connection due to conflict with recovery"), ./tcop/postgres.c- errdetail_recovery_conflict())); ./commands/trigger.c- ereport(ERROR, ./commands/trigger.c: (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), ./commands/trigger.c- errmsg("could not serialize access due to concurrent update"))); ./commands/trigger.c- if (!ItemPointerEquals(&update_ctid, &tuple.t_self)) So I tried to look at the severity code to find out whether it is a fatal error or not. Unfortunately it is almost impossible because it could be localized. My question is, why do we need to localize this? Shouldn't it be a computer-recognizable message? Also, I wonder why "conflict with recovery" generates the same error code as a serialization error. To me that does not seem like a very consistent choice. IMO, we should at least assign different error codes to conditions that have totally different effects on the frontend: aborting the transaction vs. terminating the session. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 7:35 AM, Florian Pflug wrote: >> * Allow users to access tables in >1 database easily, with appropriate >> rights. > > That one I'm very sceptical about. In the long run, I think we want better > separation of databases, not less, and this requirement carries a huge risk > of standing in the way of that. Also, I think that once we integrate the > postgres > FDW into core (that's the plan in the long run, right?), we're going to get > a good approximation of that essentially for free. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 12:35, Florian Pflug wrote: >> * Allow users to access tables in >1 database easily, with appropriate >> rights. > > That one I'm very sceptical about. In the long run, I think we want better > separation of databases, not less, and this requirement carries a huge risk > of standing in the way of that. Also, I think that once we integrate the > postgres > FDW into core (that's the plan in the long run, right?), we're going to get > a good approximation of that essentially for free. It's a poor approximation of it, free or not. If it actually worked well, I'd be happy. It doesn't. No proper transaction support, no session pool, poor planning, etc. At best it's a band-aid, not something sufficiently good to recommend for general production use. FDWs are very good but aren't good enough for intra-database usage. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On May22, 2012, at 11:46 , Simon Riggs wrote: > * Ability to have a Role that can only access one Database > > * Allow user info to be dumped with a database, to make a db > completely self-consistent These two could be achieved by having database-local roles I think. > * Allow databases to be transportable That would be very useful, but extremely hard to do unless we switch to per-database XID spaces. Or unless we're content with having databases only be transportable after some special "MAKE TRANSPORTABLE" operation, which would freeze all tuples and make the database read-only. > * Allow users to access tables in >1 database easily, with appropriate rights. That one I'm very sceptical about. In the long run, I think we want better separation of databases, not less, and this requirement carries a huge risk of standing in the way of that. Also, I think that once we integrate the postgres FDW into core (that's the plan in the long run, right?), we're going to get a good approximation of that essentially for free. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On 22 May 2012 12:12, PostgreSQL - Hans-Jürgen Schönig wrote: > > On May 22, 2012, at 9:57 AM, Simon Riggs wrote: > >> On 22 May 2012 06:50, Tom Lane wrote: >> >>> Currently, the planner keeps paths that appear to win on the grounds of >>> either cheapest startup cost or cheapest total cost. It suddenly struck >>> me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor >>> fast-start preference, etc) we could know a-priori that cheapest startup >>> cost is not going to be interesting, and hence immediately discard any >>> path that doesn't win on total cost. >> >> My experience is that most people don't provide a LIMIT explicitly >> even when they know that's the desired behaviour. That's because >> either they simply don't understand that SQL can return lots of rows, >> or SQL knowledge isn't enough, or worse that people don't even know >> that specifying it would alter query plans. >> >> Regrettably the current planning of LIMIT clauses causes more problems >> so in many cases these have been explicitly removed from SQL by >> developers that know how many rows they wish to see. >> >> I would have proposed a default-LIMIT parameter before now, but for >> that last point. > > > this sounds like a total disaster to me ... > why in the world should we have a default LIMIT parameter? It's common for people to write SQL they want to have a fast response to, yet not to specify that in their SQL code. > i guess if somebody is not able to use LIMIT he should better not touch the > DB. > we clearly cannot fix incompetence by adding parameters. Not sure I understand that argument. Many things are set by default in parameters. Oracle allows the user to choose whether plans return rows quickly, or execute plans efficiently overall. That is a relevant user choice, not a hint. We simply don't know, if LIMIT is absent, whether the user wishes fast or efficient plans. If they use LIMIT, we know they want a fast plan. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte
> Thanks for your comments. They clarify a lot. > But I still don't realize how we can distinguish IS_LCPRV2 and IS_LC2? > Isn't it possible for them to produce the same pg_wchar? If LB is in the 0x90 - 0x99 range, then it is LC2. If LB is in the 0xf0 - 0xff range, then it is LCPRV2. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 12:05, José Luis Tallón wrote: > IMVHO: s/database/schema/g does resolve many of the problems that you were > referring to... and 'dblink' should solve the rest, right? > Please, feel free to point out what I am (most probably) not considering -- > not experienced enough yet :) The choice of schema/database is an important one. If you get it wrong, you are in major difficulty. In many cases schemas would be a better choice, but not in all cases. So I'm interested in solving the problems for people who have multiple databases on the same server. dblink is the only solution, but it's a very poor way to do this when we have 2 databases on the same server. My thinking is that reaching out to multiple databases is actually mostly easy, except in a few places where dbid is hardwired into the backend. > On the other hand, the separation of databases allows what otherwise would > only be possible by using multiple instances of the database server (à la > Oracle, AFAIK ) -- save for resource management, but that is another > question whatsoever. Separation of databases is fine. I have no intention to change that, as long as the user wishes that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On May 22, 2012, at 9:57 AM, Simon Riggs wrote: > On 22 May 2012 06:50, Tom Lane wrote: > >> Currently, the planner keeps paths that appear to win on the grounds of >> either cheapest startup cost or cheapest total cost. It suddenly struck >> me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor >> fast-start preference, etc) we could know a-priori that cheapest startup >> cost is not going to be interesting, and hence immediately discard any >> path that doesn't win on total cost. > > My experience is that most people don't provide a LIMIT explicitly > even when they know that's the desired behaviour. That's because > either they simply don't understand that SQL can return lots of rows, > or SQL knowledge isn't enough, or worse that people don't even know > that specifying it would alter query plans. > > Regrettably the current planning of LIMIT clauses causes more problems > so in many cases these have been explicitly removed from SQL by > developers that know how many rows they wish to see. > > I would have proposed a default-LIMIT parameter before now, but for > that last point. This sounds like a total disaster to me ... why in the world should we have a default LIMIT parameter? I guess if somebody is not able to use LIMIT, he had better not touch the DB. We clearly cannot fix incompetence by adding parameters. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22/05/12 11:46, Simon Riggs wrote: On 21 May 2012 20:40, Stephen Frost wrote: This is important. I like the idea of breaking down the barriers between databases to allow it to be an option for one backend to access tables in multiple databases. The current mechanism doesn't actually prevent looking at data from other databases using internal APIs, so full security doesn't exist. It's a very common user requirement to wish to join tables stored in different databases, which ought to be possible more cleanly with correct privileges. That's really a whole different ball of wax and I don't believe what Robert was proposing would actually allow that to happen due to the other database-level things which are needed to keep everything consistent... That's my understanding, anyway. I'd be happy as anyone if we could actually make it work, but isn't like the SysCache stuff per database? Also, cross-database queries would actually make it more difficult to have per-database roles, which is one thing that I was hoping we might be able to work into this, though perhaps we could have a shared roles table and a per-database roles table and only 'global' roles would be able to issue cross-database queries.. IMVHO: s/database/schema/g does resolve many of the problems that you were referring to... and 'dblink' should solve the rest, right? Please, feel free to point out what I am (most probably) not considering -- not experienced enough yet :) On the other hand, the separation of databases allows what otherwise would only be possible by using multiple instances of the database server (à la Oracle, AFAIK) -- save for resource management, but that is another question altogether. So collecting a few requirements from various places: * Ability to have a Role that can only access one Database Yes, please * Allow user info to be dumped with a database, to make a db completely self-consistent +1 * Allow databases to be transportable +1. Ideally, the binary format could be made platform-independent, so that a snapshot/rsync of the cluster can span architectures easily. AFAIK, endianness-change is relatively cheap on current processors [1 ASM instruction?] and it's not like we are memory-mapping tuples anyway (TOASTed values can certainly not be mapped), so it shouldn't be noticeable performance-wise. * Allow users to access tables in >1 database easily, with appropriate rights. See above, but I am probably wrong ... I don't see any reasons why these things would be against each other. They look quite orthogonal to me. The main objectives are to make a Database a more easily used administrative grouping. At present, people who use multiple Databases face many problems - they aren't as separate as you'd like, but neither can they be ignored when required. The idea of "one main database per session" is fine, but wiring it so closely into the backend has a few disadvantages, many of them weird internal things. Are there arguments against those requirements before we spend time on design/thinking? OTOH, the postmaster/cluster - session/database coupling looks clean and simple to me... and seems to make the code simpler. This can only be good (but again, I don't know enough yet to be sure) Regards, Jose Luis Tallon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte
On Tue, May 22, 2012 at 11:50 AM, Tatsuo Ishii wrote: > > I think it's possible. The first characters are defined like this: > > #define IS_LCPRV1(c)((unsigned char)(c) == 0x9a || (unsigned char)(c) == 0x9b) > #define IS_LCPRV2(c)((unsigned char)(c) == 0x9c || (unsigned char)(c) == 0x9d) > > It seems IS_LCPRV1 is not used in any of PostgreSQL supported > encodings at this point, that means there's 0 chance which existing > databases include LCPRV1. So you could safely ignore it. > > For IS_LCPRV2, it is only used for Chinese encodings (EUC_TW and BIG5) > in backend/utils/mb/conversion_procs/euc_tw_and_big5/euc_tw_and_big5.c > and it is fixed to 0x9d. So you can always restore the value to 0x9d. > > > Also in this part of code we're shifting first byte by 16 bits: > > > > if (IS_LC1(*from) && len >= 2) > > { > > *to = *from++ << 16; > > *to |= *from++; > > len -= 2; > > } > > else if (IS_LCPRV1(*from) && len >= 3) > > { > > from++; > > *to = *from++ << 16; > > *to |= *from++; > > len -= 3; > > } > > > > Why don't we shift it by 8 bits? > > Because we want the first byte of LC1 case to be placed in the second > byte of wchar. i.e. > > 0th byte: always 0 > 1st byte: leading byte (the first byte of the multibyte) > 2nd byte: always 0 > 3rd byte: the second byte of the multibyte > > Note that we always assume that the 1st byte (called "leading byte": > LB in short) represents the id of the character set (from 0x81 to > 0xff) in MULE INTERNAL encoding. For the mapping between LB and > charsets, see pg_wchar.h. Thanks for your comments. They clarify a lot. But I still don't realize how we can distinguish IS_LCPRV2 and IS_LC2? Isn't it possible for them to produce the same pg_wchar? -- With best regards, Alexander Korotkov.
[HACKERS] Changing the concept of a DATABASE
On 21 May 2012 20:40, Stephen Frost wrote: >> This is important. I like the idea of breaking down the barriers >> between databases to allow it to be an option for one backend to >> access tables in multiple databases. The current mechanism doesn't >> actually prevent looking at data from other databases using internal >> APIs, so full security doesn't exist. It's a very common user >> requirement to wish to join tables stored in different databases, >> which ought to be possible more cleanly with correct privileges. > > That's really a whole different ball of wax and I don't believe what > Robert was proposing would actually allow that to happen due to the > other database-level things which are needed to keep everything > consistent... That's my understanding, anyway. I'd be happy as anyone > if we could actually make it work, but isn't like the SysCache stuff per > database? Also, cross-database queries would actually make it more > difficult to have per-database roles, which is one thing that I was > hoping we might be able to work into this, though perhaps we could have > a shared roles table and a per-database roles table and only 'global' > roles would be able to issue cross-database queries.. So collecting a few requirements from various places: * Ability to have a Role that can only access one Database * Allow user info to be dumped with a database, to make a db completely self-consistent * Allow databases to be transportable * Allow users to access tables in >1 database easily, with appropriate rights. I don't see any reasons why these things would be against each other. The main objectives are to make a Database a more easily used administrative grouping. At present, people who use multiple Databases face many problems - they aren't as separate as you'd like, but neither can they be ignored when required. The idea of "one main database per session" is fine, but wiring it so closely into the backend has a few disadvantages, many of them weird internal things. Are there arguments against those requirements before we spend time on design/thinking? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] heap metapages
On 22 May 2012 02:50, Robert Haas wrote: >> Not very sure why a metapage is better than a catalog table. > > Mostly because there's no chance of the startup process accessing a > catalog table during recovery, but it can read a metapage. OK, sounds reasonable. Based upon all you've said, I'd suggest that we make a new kind of fork, in a separate file for this, .meta. But we also optimise the VM and FSM in the way you suggest so that we can replace .vm and .fsm with just .meta in most cases. Big tables would get a .vm and .fsm appearing when they get big enough, but that won't challenge the inode limits. When .vm and .fsm do appear, we remove that info from the metapage - that means we keep all code as it is currently, except for an optimisation of .vm and .fsm when those are small enough to do so. We can watermark data files using special space on block zero, sneaking that in when the page is next written, but that is regarded as optional, rather than an essential aspect of an upgrade/normal operation. Having pg_upgrade touch data files is both dangerous and difficult to back out in case of a mistake, so I am wary of putting the metapage at block 0. Doing it the way I suggest means the .meta files would be wholly new and can be deleted as a back-out. We can also clean away any unnecessary .vm/.fsm files as a later step. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On 22 May 2012 06:50, Tom Lane wrote: > Currently, the planner keeps paths that appear to win on the grounds of > either cheapest startup cost or cheapest total cost. It suddenly struck > me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor > fast-start preference, etc) we could know a-priori that cheapest startup > cost is not going to be interesting, and hence immediately discard any > path that doesn't win on total cost. My experience is that most people don't provide a LIMIT explicitly even when they know that's the desired behaviour. That's because either they simply don't understand that SQL can return lots of rows, or SQL knowledge isn't enough, or worse that people don't even know that specifying it would alter query plans. Regrettably the current planning of LIMIT clauses causes more problems so in many cases these have been explicitly removed from SQL by developers that know how many rows they wish to see. I would have proposed a default-LIMIT parameter before now, but for that last point. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
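For illustration (the table and its index are assumed, and actual plans depend on data and cost settings): without a LIMIT only total cost matters, so sorting the whole table may win; with a LIMIT, a fast-start path such as an index scan in created_at order can win even though its total cost is higher.

EXPLAIN SELECT * FROM events ORDER BY created_at;
EXPLAIN SELECT * FROM events ORDER BY created_at LIMIT 10;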