Re: [HACKERS] elegant and effective way for running jobs inside a database
Keep in mind that it's not about coding in C but mostly about figuring out what a sane design ought to look like. While I can straddle the fence pretty well, my first reaction is that we are talking about application functionality that falls outside what belongs in core PostgreSQL. I'd rather see pgAgent be improved and act as a basic implementation while, for more complex use-cases, letting the community/marketplace provide solutions. Even with simple use-cases you end up having a separate process continually running anyway.

The main benefit to linking with core would be the ability to start that process after the server starts and shut it down before the server shuts down. That communication channel is something to consider outside this specific application and, if done, could be used to talk with whatever designated pgAgent-like application the user chooses. Other applications could also be communicated with in this way. Basically some form of API where in the postgresql.conf file you specify which IP addresses and ports you wish to synchronize and which executable to launch just prior to communicating on said port. If the startup routine succeeds, then Postgres will, within reason, attempt to communicate and wait for these external processes to finish before shutting down. If the external application closes, it should proactively notify Postgres that it is doing so, AND if you start up a program manually it can look for and talk with a running Postgres instance.

David J.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
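To make the idea concrete, here is a purely hypothetical sketch of what such settings might look like. None of these parameters exist in any PostgreSQL release; the names are invented solely to illustrate the proposal above.

```
# Hypothetical managed-process settings (invented; not real GUCs):
external_process.agent.command = '/usr/local/bin/pgagent'
external_process.agent.address = '127.0.0.1'
external_process.agent.port    = 9050

# Postgres would launch the command after startup, handshake on the given
# address/port, and wait for the process to exit before shutting down.
```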
Re: [HACKERS] Parser - Query Analyser
On Nov 17, 2012, at 9:18, Michael Giannakopoulos miccagi...@gmail.com wrote: Hello guys, My name is Michail Giannakopoulos and I am a graduate student at University of Toronto. I have no previous experience in developing a system like PostgreSQL before. What I am trying to explore is if it is possible to extend PostgreSQL in order to accept queries of the form:

Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM [database_name];

Anything is possible, but what you are trying to do makes little sense generally and would take a tremendous amount of work to be done in PostgreSQL. The two main limitations are that you are creating an entirely new query language format and that the name of the database is constant and determined at the time of connection to the database. From a practical perspective I do not believe it (as written exactly above) can be done without breaking existing functionality and/or introducing ambiguities. As I am not a PostgreSQL developer myself I cannot be of much more help, but ISTM that providing more why and less what would get you better advice. As to learning how to contribute to the project I will let others point you to the existing resources that are out there. It would, however, probably help to explain what skills and background you already possess.

David J.
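For what it's worth, something close to the desired shape already exists for set-returning functions, where the output column-definition list goes in the FROM clause rather than a database name. The function and column names below are invented for illustration:

```sql
-- my_func is a hypothetical set-returning function declared RETURNS SETOF record.
-- The AS clause supplies the output column names and types:
SELECT out1, out2
FROM my_func(1, 2, 3) AS output(out1 int, out2 text, outk numeric);
```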
Re: [HACKERS] Materialized views WIP patch
On Nov 27, 2012, at 5:25, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: So my proposal for the current feature would be:

ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
UPDATE MATERIALIZED VIEW mv;

The choice of keywords and syntax here hopefully clearly hints the user about the locking behavior of the commands, too. And as we said, the bare minimum for this patch does *not* include the CONCURRENTLY option, which we still all want to have (someday). :)

I dislike using ALTER syntax to perform a data-only action. The other advantage of non-functional syntax is that you could more easily supply some form of WHERE clause should you only want to perform a partial refresh. With a function call that becomes more obtuse.

David J.
Re: [HACKERS] Materialized views WIP patch
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Dimitri Fontaine
Sent: Tuesday, November 27, 2012 10:03 AM
To: Kevin Grittner
Cc: Pavel Stehule; Peter Eisentraut; Pgsql Hackers
Subject: Re: [HACKERS] Materialized views WIP patch

Kevin Grittner kgri...@mail.com writes: changing the structure of the table. Somehow I don't find that persuasive as an argument that ALTER MATERIALIZED VIEW should rescan the source relations and build a whole new set of data for exactly the same MV definition.

Fair enough. Consider that in relational theory a table is considered a relation variable. ALTER is supposed to change the definition of the variable in some way. Other statements are used to change the value contained in the variable. Sure there are some grey areas already, but I don't see where we need to muddy the waters in this case.

Under that light, using ALTER is strange indeed. I still don't like using LOAD that much; allow me to try a last syntax proposal. Well, all I can find just now would be:

UPDATE MATERIALIZED VIEW mv FOR EACH ROW;
UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ];

The only value of such a proposal is that it's not LOAD and it's still not introducing any new keyword. Oh, it also avoids overloading the SNAPSHOT keyword. Well, it still does not look like the best candidate. Regards,

Just a thought, but how about something like:

DO REFRESH OF MATERIALIZED VIEW mat_view;

In effect we begin overloading the meaning of DO to not only mean anonymous code blocks but to also call pre-defined internal routines that can be executed without having to use function-call syntax. MATERIALIZED VIEW can be made more generic (e.g., TABLE) if the need arises, the REFRESH action is generic, and additional clauses can be added after the object name (FOR, CONCURRENTLY, WHERE, etc...)

David J.
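A side-by-side sketch of the candidate syntaxes from this thread; none of these exist as written, and mv/mat_view are placeholder names:

```sql
ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];              -- first proposal
UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ]; -- second proposal
DO REFRESH OF MATERIALIZED VIEW mat_view;                        -- proposal above
```

(For the record, the syntax that eventually shipped was REFRESH MATERIALIZED VIEW mv in PostgreSQL 9.3, with the CONCURRENTLY option following in 9.4.)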
Re: [HACKERS] enhanced error fields
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Geoghegan
Sent: Monday, December 10, 2012 3:29 PM
To: Pavel Stehule
Cc: PostgreSQL Hackers; Alvaro Herrera; Tom Lane
Subject: Re: [HACKERS] enhanced error fields

Now, there are one or two places where these fields are not actually available even though they're formally required according to a literal reading of the above. This is only because there is clearly no such field sensibly available, even in principle - to my mind this cannot be a problem, because the application developer cannot have any reasonable expectation of a field being set. I'm really talking about two cases in particular:

* For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide schema_name and table_name in the event of domains. This was previously identified as an issue. If it is judged better to not have any requirements there at all, so be it.

* For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport call, we may not provide a constraint name iff Constraint.conname is NULL. Since there isn't a constraint name to give even in principle, and this is an isolated case, this seems reasonable.

Just skimming this topic, but if these enhanced error fields are going to be used by software, and we have 99% adherence to a standard, then my first reaction is: why not just supply "Not Applicable" (or "Not Available", as appropriate) instead of suppressing the field altogether in these (and possibly other, future) cases, and make adherence for these fields 100%? From an ease-of-use aspect for the API, if I can simply always query each of those fields and know I will be receiving a string, it does at least seem theoretically easier to interface with.

If I am expecting special string values (enclosed in symbols making them invalid identifiers) I can then handle those as desired without either receiving an error or a NULL when I go to poll the missing field in those couple of instances. I may be paranoid or mistaken regarding how this works, but I figured I'd at least throw it out for consideration.

David J.
Re: [HACKERS] Review of Row Level Security
The more secure behavior is to allow entry of data which will not be visible by the person doing the entry.

I don't think it is that simple. Allowing inserts without regard for row level restrictions makes it far easier to probe for data, e.g. by inserting rows and checking for unique violations.

So the PK column(s) are not as secure as, say, the address-related columns. Vice-versa, I may know that someone lives at a given address (because my attempt to place someone else there failed) but I would have no way of knowing who that other person is. My recourse would be to escalate the data-entry request to someone with higher security permissions who could read and write to the appropriate tables and resolve the conflict. In both cases the direct write-only situation necessitates that some level of exposure occurs. The work-around, if that is unacceptable, would be to accept all data but have any entries that cannot be directly inserted into the table remain in a staging area that someone with higher security would have to monitor and clear as needed. The same intervention is required, but in the first situation you can at least avoid coding the special logic and instead trade security for ease-of-use.

As a default level of security we could throw a generic "secure DML rejected for ROW(...)" error and not tell the user anything about the cause. If that person knows all unique indexes and constraints defined on the table they could use trial-and-error to discover information about stored records, but even then, if they get an error on two different columns they still have no way of knowing if those errors belong to the same record. Beyond that level you provide the user with some information as to the cause so that they have a reasonable chance to catch typos and other mistakes instead of escalating a benign issue. Lastly is the custom solution whereby the developers accept ALL data entered as being correct but save it to a staging table. A review process by someone with higher security clearances would then process and clear out that table as necessary. If the user is write-only then, regardless of whether the entry succeeded or failed, they are considered to be done with their task at that point and no meaningful results from the system can be supplied to them.

None of these options disallows the presence of non-security related check constraints being checked, enforced, and communicated to the user. I've probably lost sight of the bigger picture as my response is mostly informed by these last couple of messages.

David J.
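To illustrate the probing risk described above (the schema and values are invented):

```sql
-- A write-only user can discover hidden rows by exploiting unique violations.
CREATE TABLE residents (
    ssn     text PRIMARY KEY,
    name    text,
    address text
);

-- The probing user cannot SELECT from residents, but can INSERT:
INSERT INTO residents VALUES ('123-45-6789', 'probe', 'probe');
-- ERROR: duplicate key value violates unique constraint "residents_pkey"
-- => the user now knows a row with that SSN exists, despite never reading it.
```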
Re: [HACKERS] PL/PgSQL STRICT
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Marko Tiikkaja
Sent: Friday, December 21, 2012 10:53 AM
To: Tom Lane
Cc: PostgreSQL-development
Subject: Re: [HACKERS] PL/PgSQL STRICT

On 12/21/12 4:49 PM, I wrote: On 12/21/12 4:39 PM, Tom Lane wrote: What is the use-case for this? Currently, the way to do this would be something like: I realize I didn't really answer the question. The use case is when you're UPDATEing or DELETEing a row and you want to quickly assert that there should be exactly one row. For example, if you've previously locked a row with SELECT .. FOR UPDATE, and now you want to UPDATE or DELETE it, it better be there (or you have a bug somewhere).

There had better be exactly one row - but who cares whether that is the row we were actually expecting to delete/update... I've recently had the experience of missing a WHERE pk = ... clause in an UPDATE statement inside a function, so I do see the value in having an easy-to-implement safety idiom along these lines. Along the lines of EXPLAIN (options) CMD, would something like UPDATE|DELETE (STRICT) identifier work?

David J.
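The hand-rolled version of that safety check, which the proposed STRICT would condense, can be written today with GET DIAGNOSTICS. The table, column, and function names here are invented:

```sql
-- Assert that the UPDATE touched exactly one row; raise otherwise.
CREATE FUNCTION close_order(p_id int) RETURNS void AS $$
DECLARE
    n bigint;
BEGIN
    UPDATE orders SET closed = true WHERE id = p_id;
    GET DIAGNOSTICS n = ROW_COUNT;
    IF n <> 1 THEN
        RAISE EXCEPTION 'expected exactly 1 row, updated %', n;
    END IF;
END;
$$ LANGUAGE plpgsql;
```

Forgetting the WHERE clause above would update every row but still succeed silently without the ROW_COUNT check, which is exactly the bug class being discussed.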
Re: [HACKERS] count(*) of zero rows returns 1
Tom Lane-2 wrote: Gurjeet Singh lt; singh.gurjeet@ gt; writes: Can somebody explain why a standalone count(*) returns 1?

postgres=# select count(*);
 count
-------
     1
(1 row)

The Oracle equivalent of that would be SELECT count(*) FROM dual. Does it make more sense to you thought of that way? I agree it's an odd thing for someone to query, but I feel it should return 0, and not 1. For that to return zero, it would also be necessary for SELECT 2+2 to return zero rows. Which would be consistent with some views of the universe, but not particularly useful. Another counterexample is

regression=# select sum(42);
 sum
-----
  42
(1 row)

which by your argument would need to return NULL, since that would be SUM's result over zero rows.

Given that SELECT *; results in:

SQL Error: ERROR: SELECT * with no tables specified is not valid

then an aggregate over an error should not magically cause the error to go away. I am curious on some points:

Is there something in the standard that makes SELECT count(*) valid?
What does SELECT * FROM dual in Oracle yield?
Is there a meaningful use case for SELECT sum(42), or more specifically any aggregate query where there are no table/value inputs?

I get SELECT 2+2 and its ilk, as there needs to be some way to evaluate constants. I get that the horse has already left the barn on this one, but neither 0 nor 1 seems a particularly sound answer to the question SELECT count(*).

David J.
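The distinction being debated becomes clearer when an aggregate is given a genuinely empty input to compare against the no-FROM case:

```sql
SELECT count(*);                              -- 1: the implicit source is a single row
SELECT count(*) FROM generate_series(1, 0);   -- 0: count over zero input rows
SELECT sum(42)  FROM generate_series(1, 0);   -- NULL: SUM over zero input rows
```

In other words, a FROM-less SELECT behaves like a one-row table (Oracle's dual), not like an empty one.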
Re: [HACKERS] Covering Indexes
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of David E. Wheeler
Sent: Tuesday, July 17, 2012 11:55 AM
To: Simon Riggs
Cc: Pg Hackers
Subject: Re: [HACKERS] Covering Indexes

On Jul 17, 2012, at 5:32 PM, Simon Riggs wrote: CREATE INDEX ON foo (a, b, c, d); allows SELECT c, d FROM foo WHERE a = ? AND b = ? to use an index only scan. The phrase unindexed seems misleading since the data is clearly in the index from the description on the URL you gave. And since the index is non-unique, I don't see any gap between Postgres and SQLite 4.

Yeah, but that index is unnecessarily big if one will never use c or d in the search. The nice thing about covering indexes as described for SQLite 4 and implemented in MSSQL is that you can specify additional columns that just come along for the ride, but are not part of the indexed data: CREATE INDEX cover1 ON table1(a,b) COVERING(c,d); Yes, you can do that by also indexing c and d as of 9.2, but it might be nice to be able to include them in the index as additional row data without actually indexing them. Best, David

Concretely, I would presume that the contents of a covering index could then look like the following (a,b,c,d):

(2,1,2,A)
(2,1,5,A) -- the 5 is out of natural order but exists in the covering part
(2,1,3,A)

Whereas PostgreSQL would be forced to have the index ordered as such:

(2,1,2,A)
(2,1,3,A)
(2,1,5,A)

Either way the data in c and d are IN THE INDEX; otherwise in neither case could the data values be returned while strictly querying the index. So the question that needs to be asked is what kind of performance increase can be had during DML (insert/update) statements and whether those gains are worth pursuing. Since these other engines appear to allow both cases you should be able to get at least a partial idea of the performance gains between index (a,b,c,d) and index (a,b) covering (c,d).
Vik's concurrent point regarding non-indexable values makes some sense, but the use case there seems specialized, as I suspect that in the general case values that are non-indexable (if there truly are any) are generally those that would be too large to warrant sticking into an index in the first place. But XML values do ring true in my mind (particularly frequently used fragments that are generally quite small). Again, though, whether that is a reasonable use case for a covering index I do not know. It feels like trying to solve the remaining 10% when it took a long while to even muster up enough support and resources to solve the 90%.

David J.
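A sketch of the two alternatives under discussion, using the table and column names from above:

```sql
-- PostgreSQL 9.2 approach: index all four columns so an index-only scan
-- can return c and d.
CREATE INDEX idx_abcd ON table1 (a, b, c, d);

-- SQLite 4 / MSSQL style: c and d ride along in the index but are not
-- part of the sort key. This is NOT PostgreSQL syntax:
-- CREATE INDEX cover1 ON table1 (a, b) COVERING (c, d);

SELECT c, d FROM table1 WHERE a = 1 AND b = 2;   -- index-only scan candidate
```

(PostgreSQL eventually gained an equivalent of the covering form in version 11, via CREATE INDEX ... INCLUDE (c, d).)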
Re: [HACKERS] strange evaluation Window function and SRF functions?
On Jul 30, 2012, at 12:33, Thom Brown t...@linux.com wrote: On 30 July 2012 17:19, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, I've seen a nice trick based on window functions http://stackoverflow.com/questions/11700930/how-can-i-trim-a-text-array-in-postgresql but isn't it an example of wrong evaluation? The result of row_number is not correct. Looks right to me. I guess the way to get the row_number they're after out of the result set would involve changing OVER () to OVER (ORDER BY unnest(myTextArrayColumn))

The better way would be to perform the unnest in a sub-select and then attach the row number in the outer select.

David J.
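A sketch of that sub-select approach; the table name t is invented, and myTextArrayColumn is the column from the linked question:

```sql
-- Expand the array first, then number the resulting rows in the outer query,
-- so row_number() operates on the already-unnested set:
SELECT elem, row_number() OVER () AS rn
FROM (SELECT unnest(myTextArrayColumn) AS elem FROM t) AS s;
```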
Re: [HACKERS] temporal support patch
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
Sent: Monday, August 20, 2012 5:04 PM
To: Jeff Davis
Cc: Vlad Arkhipov; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] temporal support patch

On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis pg...@j-davis.com wrote: The other issue is how to handle multiple changes of the same record within the transaction. Should they be stored or not? In a typical audit log, I don't see any reason to. The internals of a transaction should be implementation details; invisible to the outside, right? I'm not convinced.

Ideally the decision of whether to do so could be a client decision. Not storing intra-transaction changes is easier than storing all changes. At worst you could stage up all changes and then simply fail to store the intermediate results within a given relation. In that case you gain nothing in execution performance but save both storage and interpretative resources. So the question becomes: is it worth doing without the ability to store intermediate results? If you were to ponder both, which setup would be the default? If the default is the harder one (all statements) to implement, then to avoid upgrade issues the syntax should specify that it is logging transactions only.

Random, somewhat related, thought: I do all my work on a temporary staging table and then, as my final action, insert the resultant records onto a separate live table and drop the temporary table. Further changes to said record I perform by deleting the original and then inserting a new record (from staging again) with all the values changed. Obviously this has limitations with respect to foreign keys and such, but it is possible. What happens to the audit log if the PK changes, and if it does not change? Are there any other implications that need to be addressed, or is it like giving a loaded gun to someone and trusting them to use it responsibly?
I'm not sure that the database user is the proper thing to be stored in the history table. Many applications usually connect to a database using some virtual user and have their own users/roles tables to handle privileges. There should be some way to substitute the stored user in the history table with the application's one. It's also helpful to store the transaction id that inserted/updated/deleted the record. If the system is recording it for audit purposes, then it had better be sure that it's true. You can't allow the application to pick and choose what gets stored there. That position would render this feature useless for every application for which I would otherwise have used it. I think it's just nonsense to talk about what we can or can't let the user do. The user is in charge, and our job is to allow him to do what he wants to do more easily, not to dictate what he must do. --

I see the user element as having two components:

Client - what device/channel/user was used to connect to the database - PostgreSQL Role
User - relative to that client, which actual user performed the action - Application Specified

A PostgreSQL role would correspond to client, whereas the application would be allowed to have full control of what User value is stored. This gets a little complicated with respect to SET ROLE but gets close to the truth. The idea is that you look at the client to determine the namespace over which the user is defined and identified. So, a better way to phrase the position is: you cannot allow the application to choose what is stored to identify itself (client) - i.e., its credentials identify who it is and those are stored without consulting the application. At that point you've basically shifted responsibility for the correctness of the audit log onto that application and away from the database. However, you do provide a place for the application to store an identifier that it is able to resolve to a user if necessary.
This is an arbitrary two-layer hierarchy, and while conceptually anything with two layers may want more, I am not sure whether the extra complexity that would entail would be worth the effort. Depending on what kinds of information you allowed to be stored for User, it becomes something that can be modeled when desired and ignored otherwise. The issue with adding the PostgreSQL role to the database in this way is that you now can never delete that role or reassign it to another entity. I guess with temporal you could do so and basically have the identity-role relationship defined over specific periods of time... I can (and have) imagined a whole level of indirection and association to be able to reasonably handle assigning and storing permanent identities while allowing logon credentials to remain outside of permanent storage.

David J.
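A sketch of the two-layer client/user scheme using an ordinary trigger; all names are invented, and the two-argument form of current_setting shown here (which returns NULL instead of erroring when the setting is absent) was only added in a later release:

```sql
-- Record both the connecting role ("client") and an application-supplied
-- identifier ("user"), per the two-layer scheme above.
CREATE TABLE audit_log (
    logged_at   timestamptz NOT NULL DEFAULT now(),
    client_role text        NOT NULL,   -- taken from the environment, not the app
    app_user    text,                   -- supplied by the application
    action      text,
    row_data    text
);

CREATE FUNCTION audit_row() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log (client_role, app_user, action, row_data)
    VALUES (session_user,
            current_setting('app.user', true),  -- e.g. SET app.user = 'jdoe'
            TG_OP,
            NEW::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```

Note that session_user comes from the connection itself, so the application cannot falsify the client layer; it only controls the app_user value it resolves internally.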
Re: [HACKERS] temporal support patch
-----Original Message-----
From: Robert Haas [mailto:robertmh...@gmail.com]
Sent: Saturday, August 25, 2012 12:46 PM
To: David Johnston
Cc: Jeff Davis; Vlad Arkhipov; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] temporal support patch

On Mon, Aug 20, 2012 at 7:17 PM, David Johnston pol...@yahoo.com wrote: Ideally the decision of whether to do so could be a client decision. Not storing intra-transaction changes is easier than storing all changes. Not really. If you don't care about suppressing intra-transaction changes, you can essentially just have a trigger that fires on every update and adds information to the side table. If you do care about suppressing them, you have to do something more complicated. Or so it seems to me.

My internals knowledge is basically zero, but it would seem that if you simply wanted the end-of-transaction result you could just record nothing during the transaction and then copy whatever values are present at commit to whatever logging mechanism you need. If you are recording intra-transaction values you could do so to a temporary storage area and then, at commit, decide whether only the most recent value for a given relation/attribute is going to be retained in the final log or whether you end up persisting all of the intermediate values as well.

You cannot allow the application to choose what is stored to identify itself (client) - i.e., its credentials identify who it is and those are stored without consulting the application. I don't think we can violate the general principle that the database super-user or table owner can do whatever they want. If one of those folks wants to falsify their history, are we really going to tell them no? To me that has "I'm sorry, Dave, I can't do that" written all over it, and I think we'll get about the same reaction that Hal did. Now, if user A is inserting into user B's table, and is not the super-user, then, of course, we can and should ensure that no falsification is possible.
With respect to the physical log file there is no way for the super-user to currently falsify (at time of statement execution) the user/role that they are using. Even a SET ROLE doesn't change the session user (I forget the exact mechanics, but I'm pretty sure of the general point). I do not see how this is that much different. I agree that it is pointless to even try to maintain true in-database auditing in the presence of god-like super-users, so most of what I envision relates to limited-permission users that are forced to rely upon the standard mechanisms provided by the database. As a matter of principle, those wanting a secure and auditable environment should not be using ownership-level roles. Since these temporal/audit tables are intended to be maintained by the system, if you do not ask the users to identify themselves but instead take the information directly from the environment, you never have to give an "I'm sorry, Dave" response, because Dave is never given the chance to submit a proposed value.

David J.
Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?
-----Original Message-----

There really needs to be some way to specify that when an expression is evaluated for each row in a set, a function used within that expression is not optimized away for some rows. Fortunately we have a way: http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html

| VOLATILE indicates that the function value can change even within a
| single table scan, so no optimizations can be made. Relatively few
| database functions are volatile in this sense; some examples are
| random(), [...]

The behavior in the OP's query would certainly be sane if the function were not VOLATILE; as it is, I have a hard time seeing this as anything but a bug.

What are the arguments against adding a 4th identifier - call it PER_ROW for the sake of argument? The main reason VOLATILE is broken is that it is the default, and in order to minimize the beginner's penalty it is not treated as such in some situations. The new one could behave just like VOLATILE but would never be optimized away and would always evaluate once for each row in its context. Then the question is whether you write a new random() function or break backwards compatibility and alter the existing version.

David J.
Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?
-----Original Message-----

| VOLATILE indicates that the function value can change even within a
| single table scan, so no optimizations can be made.
| Relatively few database functions are volatile in this sense; some
| examples are random(), [...]

What are the arguments against adding a 4th identifier - call it PER_ROW for the sake of argument? The main reason VOLATILE is broken is that it is the default, and in order to minimize the beginner's penalty it is not treated as such in some situations. The new one could behave just like VOLATILE but would never be optimized away and would always evaluate once for each row in its context.

So how would you document that? It sounds like the proposed level would behave exactly as the VOLATILE level is currently documented to behave; so I guess we could shift the documentation of VOLATILE to PER_ROW (or whatever). How would you then describe the behavior of VOLATILE?

I'm not sure, but however we would describe it, we might as well make the change now regardless of whether another level is added. The main distinguishing characteristic is that VOLATILE is not guaranteed to evaluate once-per-row if it is not dependent upon particular values within a given row.

VOLATILE: A volatile function used in an ORDER BY or WHERE clause without referencing any columns from the query itself (i.e., no parameters or all constants) will be evaluated a single time and the result treated as a constant (i.e., all rows will have identical values) for that part of the query.

PER_ROW: A per_row function will be evaluated once for every row that is visible to the function and will be treated as a virtual column of said relation, with each cell having its own value as a result of the function call.

Using random() as an example of the two possible behaviors should further clarify the differences quite nicely. Quick pass - hopefully, a) this inspires someone else, and b) this is the correct understanding in the first place.

David J.
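The once-per-statement versus once-per-row difference can be observed today by comparing a direct volatile call in WHERE with one hidden inside a scalar sub-select, which the planner evaluates as a one-time init-plan:

```sql
-- Evaluated once per row: roughly half the rows come back.
SELECT n FROM generate_series(1, 10) AS g(n)
WHERE random() < 0.5;

-- Evaluated once per statement: either all ten rows or none.
SELECT n FROM generate_series(1, 10) AS g(n)
WHERE (SELECT random()) < 0.5;
```

The proposed PER_ROW level would, in effect, guarantee the first behavior regardless of how the planner restructures the query.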
Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?
-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: Wednesday, September 19, 2012 5:51 PM
To: k...@rice.edu; David Johnston
Cc: 'Florian Schoppmann'; 'Robert Haas'; pgsql-hackers@postgresql.org; 'Tom Lane'
Subject: RE: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

David Johnston pol...@yahoo.com wrote: VOLATILE: A volatile function used in an ORDER BY or WHERE clause without referencing any columns from the query itself (i.e., no parameters or all constants) will be evaluated a single time and the result treated as a constant (i.e., all rows will have identical values) for that part of the query.

I hope you're wrong about the ORDER BY part of that. A quick test confirms that it works in ORDER BY, at least for some cases. If there are any exceptions to that, I would sure like to know about it -- and really soon.

select * from generate_series(1, 1) s(n) order by random() limit 10;

-Kevin

I'd rather have someone who knows the code assert one way or the other; I tossed it in there because I thought I'd seen people complain that random() doesn't work as expected with ORDER BY, but that may just be faulty memory. It may or may not depend on whether LIMIT/OFFSET are involved...? Used in the SELECT-list it gets evaluated for each row, and I would guess the ORDER BY could have that behavior as well (I would expect it to anyway), so is it strictly limited to WHERE clause evaluation that this discrepancy manifests?

David J.
Re: [HACKERS] Oid registry
I did like the alternative idea upthread of UUIDs for types, which would give them a virtually unlimited space. Yeah, me too. That doesn't require a centralized authority (hence, no debates here about whether a given extension is important enough to merit an allocation of a given size), doesn't move us further in the direction of exposing the database's internal identifiers as a concept that users have to care about, and provides essentially infinite address space. There's more engineering work involved, but sometimes more engineering work means a better result.

Random thought from the sideline... Git is able to provide assurances as to content because it creates a hash. Now, for a function, PostgreSQL could hash the catalog entry (including the function body) and return that as proof that said function is the same as one installed in some other database or published publicly. I do not know enough about the other objects to know if something similar is possible, but maybe this will spark someone else's thoughts.

David J.
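A rough sketch of what such a fingerprint could look like today using existing catalog functions; my_func is a placeholder name:

```sql
-- pg_get_functiondef() reconstructs the CREATE OR REPLACE FUNCTION statement
-- from the catalogs; hashing that text yields a comparable fingerprint.
SELECT md5(pg_get_functiondef(oid)) AS fingerprint
FROM pg_proc
WHERE proname = 'my_func';
```

Two databases producing the same hash for a function would then be strong evidence (modulo formatting differences in the reconstructed definition) that the bodies match.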
Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Noah Misch Sent: Tuesday, October 02, 2012 3:02 PM To: Craig Ringer Cc: PostgreSQL Hackers Subject: Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing? On Tue, Aug 21, 2012 at 02:31:29PM +0800, Craig Ringer wrote: It'd really help if REVOKE consistently raised warnings when it didn't actually revoke anything. +1 This will invite the same mixed feelings as the CREATE x IF NOT EXISTS notices, but I think it's worthwhile. Even better, a special case for REVOKEs on objects that only have owner and public permissions could say: WARNING: REVOKE didn't remove any permissions for user blah. This table/db/whatever has default permissions, so there were no GRANTs for user blah to revoke. See the documentation for REVOKE for more information. The extra aid from saying those particular things is not clear to me. It might be overkill, but we could report any other roles indirectly conveying access to the named role. Having been bitten by this myself I do see the value in such a warning. It is not uncommon for someone using REVOKE to believe they are installing a block instead of removing an allowance, especially as it interacts with default permissions. That said, and this is an off-the-cuff thought, the entire UI for permissions, and its treatment in the documentation, seems to be fact-oriented. The system is well documented but actually getting up to speed to learn and use it is still a matter of reading the documentation and figuring out how everything fits together. I haven't given it that much thought but I am curious if others are of the same opinion. IOW, this proposal is an attempt to fix a symptom without addressing the root cause. Food for thought. David J.
Re: [HACKERS] Deprecating RULES
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Andrew Dunstan Sent: Thursday, October 11, 2012 8:52 PM To: Daniel Farina Cc: Joshua D. Drake; Josh Berkus; Simon Riggs; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Deprecating RULES On 10/11/2012 08:20 PM, Daniel Farina wrote: On Thu, Oct 11, 2012 at 5:07 PM, Joshua D. Drake j...@commandprompt.com wrote: On 10/11/2012 03:59 PM, Josh Berkus wrote: I'm also not real keen on the idea that someone could dump a 9.2 database and be unable to load it into 9.3 because of the DDL trigger, especially if they might not encounter it until halfway through a restore. That seems rather user-hostile to me. Also, how would you picture that working with pg_upgrade? RULEs are a major feature we've had for over a decade. That nobody in the right mind would use in production for YEARS. That said there is a very real problem here. For a very, very long time the recommended way (wrong way in fact) to do partitioning was based on rules. Now, those in the know immediately said, WTF but I bet you that a lot of people that we don't know about are using rules for partitioning. We definitely need a warning period that this is going away. That said, I don't know that we need a whole release cycle. If we start announcing now (or before the new year) that in 9.3 we will not have rules, that gives people 9-10 months to deal with the issue and that is assuming that we are dealing with early adopters, which we aren't because early adopters are not going to be using rules. My experience suggests that only ample annoyance for at least one full release cycle will provide a low-impact switch. This annoyance must not be able to be turned off. Spot on. All our experience is that just announcing things, especially in places other than release notes and similar, is ineffective as a way of communicating with our user base. 
I'm with Tom and Josh and Daniel on this, and to be honest I'm somewhat surprised at the willingness of some people to spring surprises on users. I still come across uses of rules in the wild, and not just for partitioning either. Personally I think if we start now the earliest we should even consider removing the support is 9.4. cheers andrew Deprecation means that existing code will no longer work without refactoring. If CREATE RULE was a security hazard or unstable that may justify such an action but simply because using it properly (or at least safely) is difficult doesn't mean that those who have managed should be punished for their expertise. Late night rambling here but the risk mitigation that we seem to be caring about is new users searching for and using algorithms that they find on the web without understanding the intricacies of how those algorithms work. Do we really want to build something into the database to deal with this (by disallowing it outright) or do we do our best to provide authoritative and useful documentation so that when users go looking for the CREATE RULE command in our documentation they are provided with reasoning and alternatives to its use? RULEs may be difficult but maybe there are some rare use-cases where they would be appropriate. No one here is all-knowing and just maybe someone in the future will have an idea and decide to further improve them or at the least recognize a situation where the current implementation is useful. So, what actual harms are there to using CREATE RULE and are there less invasive means, via a more nuanced restriction implementation of CREATE RULE or simply via documentation, to mitigate those harms? Maybe there would not be enough benefits to CREATE RULE at this point in time to consider implementing it from scratch but given that it already exists it should be worth some effort to keep it functioning even if only for forward-compatibility reasons. 
And regardless, the whole "what do you use instead of CREATE RULE" documentation needs to be created no matter the eventual decision to fully remove the feature from the system. David J.
Re: [HACKERS] Potential autovacuum optimization: new tables
On Oct 12, 2012, at 22:13, Stephen Frost sfr...@snowman.net wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: [ shrug... ] You're attacking a straw man, or more precisely putting words into my mouth about what the percentage-based thresholds might be. Notice the examples I gave involved update percentages quite far north of 100%. It's possible and maybe likely that we need a sliding scale. I was just discussing such a sliding scale approach w/ Josh on IRC, my thinking was that we could use a logarithmic approach based on table size. Also, I don't necessarily accept the conclusion you seem to be drawing, that it's okay to have complete turnover of a small table and not redo its stats. If you don't like the current behavior when there's no stats, why would you like the behavior when there are some stats but they no longer have the remotest relationship to reality? Josh's concern is about autovacuum causing lots of stats churn, which is understandable, we don't want it constantly rescanning a table, but perhaps we could use some kind of threshold for preventing autovac from rescanning a table it just scanned? Note that I did *not* say 'GUC', but I don't know what the 'right' answer is for how frequently is good-but-not-too-frequent. I'd also like to try and avoid adding GUCs. Instead of global could you attach an interface function to the table and have the auto-analyzer call that function to basically ask the table whether it needs to be analyzed? Still need to deal with defaults and provide a decent supply of built-in algorithms but at least the system can be made tunable. The default algorithm could maybe just handoff to a table size specific handler. The create table and alter table commands could be used to change the assigned algorithm if desired and new ones could be supplied via extensions. The 1000 row default seems unusual at first glance and contributes to the problem described. 
It is likely that the first INSERT sent following the CREATE TABLE is going to be a bulk load if the table is going to have many rows. In the case where rows are inserted individually it is likely that the expected row count will be closer to 1 than 1000. One useful algorithm to provide the user is analyze on insert and, though maybe less so, analyze on update, so that any insert/update causes the table to be re-analyzed. Not a good default but, combined with delayed analyze logic to establish a minimum frequency, it is a possible option for some use cases. Temporary table creation should have special attention given if changes are going to be made here. Another idea is to have system after [command] trigger(s) that can be used to call ANALYZE without waiting for the auto-vacuum process. Provide some way for CREATE/ALTER TABLE and maybe auto-vacuum to enable and disable the trigger. David J.
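The logarithmic sliding-scale idea discussed above could take a shape like the sketch below. Everything here is hypothetical: this is not autovacuum's actual rule (which is threshold + scale_factor * reltuples with fixed GUCs), and the constants are made up for illustration.

```python
import math

def analyze_threshold(reltuples, base=50, scale=0.1):
    """Row-churn count after which a table would be re-analyzed.

    Hypothetical logarithmic sliding scale: small tables keep a large
    effective scale factor, while huge tables get a much smaller one,
    so they are re-analyzed long before 10% of a billion rows changes.
    """
    if reltuples <= 0:
        return base
    effective_scale = scale / math.log10(10 + reltuples)
    return base + effective_scale * reltuples
```

A per-table algorithm hook, as suggested above, could simply swap this function out for a table-size-specific handler.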
Re: [HACKERS] [WIP] pg_ping utility
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Tom Lane Sent: Monday, October 15, 2012 7:13 PM To: Andres Freund Cc: pgsql-hackers@postgresql.org; Thom Brown; Phil Sorber Subject: Re: [HACKERS] [WIP] pg_ping utility Andres Freund and...@2ndquadrant.com writes: Why not add a pg_ctl subcommand for that? For me that sounds like a good place for it... I think that's a bad fit, because every other pg_ctl subcommand requires access to the data directory. It would be very confusing if this one subcommand worked remotely when the others didn't. There was also some discussion of wedging it into psql, which would at least have the advantage that it'd typically be installed on the right side of the client/server divide. But I still think "wedging into" is the appropriate verb there: psql is a tool for making a connection and executing some SQL commands, and ping is not that. Yeah, I know a whole new executable is kind of a pain, and the amount of infrastructure and added maintenance seems a bit high compared to what this does. But a lot of the programs in src/bin/scripts are not much bigger. (In fact that might be the best place for this.) regards, tom lane This seems to be begging for a canonical pg_monitor command where pg_ping would be one sub-command. A bit much for a single command but it would provide a frame onto which additional user interfaces could be hung - though I am lacking for concrete examples at the moment. pg_monitor would be focused on database monitoring and not cluster monitoring generally but pg_ping would be a necessary prerequisite since if the cluster is not available database monitoring doesn't make any sense. With the recent focus on pg_stat_statements and the current WIP on pg_lwlocks having an official UI for accessing much of this kind of data has merit. 
Encapsulating the queries into commands makes actually using them easier and there can be associated documentation discussing how to interpret those specific commands and some level of consistency when asking for data for bug and performance reports. It may be that psql already does much of this as I am just not that familiar with the program but if that is the case then classifying it as making a connection and executing some SQL commands is a limited description. pg_ping is arguably doing at least the first part of that. David J.
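For concreteness, the liveness half of a pg_ping-style check amounts to little more than a timed TCP connect. The sketch below is an assumption-laden analogy in Python, not the actual utility (the real patch works at the PostgreSQL protocol level via libpq):

```python
import socket

def ping(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Rough stand-in for a pg_ping-style liveness probe; it only proves
    a listener exists, not that the server would accept a session.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A pg_monitor-style frontend could layer database-level checks on top of this only after the cluster-level probe succeeds, matching the ordering argued for above.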
Re: [HACKERS] Re: [PATCH] Enforce that INSERT...RETURNING preserves the order of multi rows
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Abhijit Menon-Sen Sent: Sunday, October 21, 2012 5:45 AM To: Tom Lane Cc: P. Christeas; pgsql-hackers@postgresql.org Subject: [HACKERS] Re: [PATCH] Enforce that INSERT...RETURNING preserves the order of multi rows At 2012-10-17 09:56:22 -0400, t...@sss.pgh.pa.us wrote: Clarify that in the documentation, and also write a test case that will prevent us from breaking the rule in the future. I don't believe this is a good idea in the slightest. Yeah, the current implementation happens to act like that, but there is no reason that we should make it guaranteed behavior. I always thought it *was* guaranteed, and I've encountered code written by other people who were obviously under the same impression: take some strings (e.g. flag names), use insert … returning id, map the ids back to the names, and use the values in further inserts into other tables (flag_id foreign key references flags). I know one could say returning id, name, but there's certainly code out there that doesn't do this. I personally think the return order should be guaranteed; and if not, then the documentation urgently needs some prominent warnings to tell people that they should not assume this (for any variant of RETURNING). -- Abhijit Order is never guaranteed unless an ORDER BY clause is involved in processing the data immediately prior to its use. I could see this being in a "rules that you must always remember" listing but to include it in every location where people might be inclined to rely upon ordering is just going to clutter the documentation. That said, I'm not personally opposed to this documentation suggestion. But while the idea is acceptable the actual changes proposed by someone's patch are what need to be approved and applied. 
As to the order of RETURNING I do not see an overly compelling reason to enforce such a limitation; and in general implicit guarantees like this are undesirable since there is no way to turn them off. For sorting in particular the action itself can be expensive and not always needed. While we are not talking strictly sorting here (just maintained order) the concept still applies. David J.
Re: [HACKERS] Deprecating RULES
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Merlin Moncure Sent: Monday, October 22, 2012 6:54 PM To: Robert Haas Cc: Andrew Dunstan; Josh Berkus; Daniel Farina; pgsql- hack...@postgresql.org Subject: Re: [HACKERS] Deprecating RULES Good point on the CTE (and it's correct). I think by any reasonable definition rules are in fact already de facto deprecated: they are not being extended to interact with other features and the community is advising against their use. I don't think anybody would complain if/when a hypothetical MERGE feature was advanced without rule interaction. That said, I don't think there is any reasonable argument to remove rules. Backwards compatibility should only be broken when it *must* be broken. Any 'developer interest only' standards ('grotty code', 'inelegant', 'ill advised for new code', etc) of removal are completely specious and thus are IMSNHO irrelevant. merlin While I agree with this sentiment to some degree, in order for the community to thrive new developer blood needs to be introduced periodically. Not that this feature is particularly an issue but making the codebase easier to learn and maintain has considerable value in its own right. To put a different spin on things, it is like CREATE RULE is a specialty tool. Taken that way we should strictly describe the use-cases where CREATE RULE behavior is well-defined and problem free. If the end-user isn't trying to use RULEs in exactly those cases then they are advised to attempt another solution or send an e-mail to the list to get some expert opinions on that particular use-case. Known problematic uses can also be listed to minimize the amount of "not listed, what do y'all think" e-mails sent to the list. In this setup there is some developer obligation to try and not break those well-defined use-cases; but that exists today even if it is not explicitly mentioned. David J. 
Re: [HACKERS] Creating indexes in the background
On Oct 27, 2012, at 19:20, David Lee davidomu...@gmail.com wrote: Hey folks, It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancel the statement (i.e. you must keep your statement open). Is there a way to launch an index creation in the background so that the statement doesn't need to be kept open? --David No, there is not. David J.
Re: [HACKERS] Add contrib module functions to docs' function index
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Craig Ringer Sent: Tuesday, November 13, 2012 6:15 PM To: PostgreSQL Hackers Subject: [HACKERS] Add contrib module functions to docs' function index Hi all Andreas Heiduk on -bugs suggested that we add the functions provided by contrib modules to the function index in the docs, so it's easier to go from, say, what the heck is idx(...) to finding it in the intarray contrib module. This seems like a good idea and I'd like to pop it in the TODO until I get time to check it out, flagged as a minor/newbie-friendly problem. Any objections? For clarity, does this proposal refer to Chapter 9 of the documentation, the Index, or both? If modifying Chapter 9 the function and operator tables should be extended to include a source column with values of "base" or "contrib: module name" or something similar. As to the desirability of such a change I concur that it would be a nice usability enhancement to consider beyond just updating the actual Index. David J.
Re: [HACKERS] Switching to Homebrew as recommended Mac install?
On Apr 1, 2012, at 21:50, Jay Levitt jay.lev...@gmail.com wrote: Tom Lane wrote: While you might not like the EDB installer, at least those folks are active in the lists and accountable for whatever problems their code has. Who in heck is responsible for the homebrew packaging, and do they answer questions in the PG lists? Just for general knowledge... Who's responsible is whoever wants to be; homebrew is open source, and with a github-based workflow, it's trivial for them to accept pull requests. Of the 1967 formulas (packages) in the repo, there have been 1759 contributors. I was volunteering to be the maintainer and liaison if we did this; I'm pretty good at email and IRC. It's actually pretty clever and elegant - homebrew itself uses git and github for formula updates and distribution, and is written in ruby1.8 which ships with all Macs. /usr/local is a git repo, brew update is mostly git pull, and brew search checks for new pull requests if it doesn't find a matching formula. The whole thing's all of 1500 lines of code, and you saw what formulas look like. Jay You seem highly approving of homebrew and seem willing to develop and support it. I guess the question to be asked is what requirements you would expect to have to meet before the Mac Downloads section would list your installer routine along with the three already present? Aside from that, unless you are really intent on trying to prove yourself to be the best, if you are trying to overcome shortcomings of the existing installers it would still be nice to let them know how you feel things could be improved for the community/user sub-set you belong to. As a Windows developer (though production is on Linux) I get where you are coming from with respect to user permissions and the like - what is desirable in development and in production do differ and so having different installation routines for them makes some sense. 
Until your developers go to install on the production server and do not realize that they should be doing something different in order to make the server more secure than their development environment. From what I follow I think you have really good ideas and sound reasoning. You do not need permission to contribute to the community in the way you seek so what is it that you are really asking for? From the sound of things your primary focus is not in supporting the PostgreSQL community via providing services to others or developing new tools. When brew is replaced by something more popular do you think you will continue to maintain the recipe or is it going to end up stuck showing us how to install version 9.3 or earlier? I'm beyond my element here but the current installer maintainers are doing so in addition to their other, more regular, contributions. That said, the contribution, even if it did stall in the future, would still be welcomed and if it is found to be very useful someone would likely pick up the torch as long as it is released under the same terms as PostgreSQL itself. Just trying to bridge an apparent gap since the original e-mail seems to have come across as so adversarial that the underlying thoughts have been overlooked. Trying to contribute in my own way with my current resources. David J.
Re: [HACKERS] Switching to Homebrew as recommended Mac install?
On Apr 1, 2012, at 13:14, Jay Levitt jay.lev...@gmail.com wrote: The Mac installation docs currently recommend the EDB one-click installer as the first choice. While this does install pgadmin and some other refinements, it also is fairly confusing to troubleshoot: The items are not numbered and it is impossible to avoid special ordering. There are three options - and yes EDB is listed at the top of the page - but nothing else implies any kind of order and given that they are dealing with different ways to package if someone prefers MacPorts or Fink the fact they are listed lower shouldn't induce them to pick the unfamiliar one first. David J.
Re: [HACKERS] 9.2 release notes, beta time?
On Apr 27, 2012, at 21:24, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: On Fri, Apr 27, 2012 at 09:10:54PM -0400, Tom Lane wrote: How would that help? The bottleneck is packaging, which is the same work whatever we call it. We release the alpha with no packaging. Think we'd lose a lot of potential testers that way. It seems that even if you published before PGCon, a reasonable number of possible testers would be there and/or responding to any feedback is going to be delayed since developers are going to be there. I could see the goal being having a package-ready commit before PGCon but schedule the official release until after. No matter what you call it, those who want a head start can self-compile while others will just wait until the packagers are done. Either way the same codebase will be in the wild (so probably don't want to call it alpha). The original reason for pre-conference is so the developers can feel less bad talking about 9.3 features (and 9.2 post-mortem) since the beta for 9.2 will be completed. David J.
Re: [HACKERS] precision and scale functions for numeric
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Robert Haas Sent: Monday, April 30, 2012 2:20 PM To: Peter Eisentraut Cc: pgsql-hackers Subject: Re: [HACKERS] precision and scale functions for numeric I think you could test for integer-ness by testing whether val % 0 = 0. Either I am missing something here or you are. Since modulus is a division function, anything % 0 results in a division-by-zero ERROR - division has to occur before a remainder can be obtained. Maybe val % 2 NOT IN (0,1) ... David J.
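The division-by-zero objection is easy to check outside SQL. The sketch below uses Python's decimal module as a rough analogy for numeric; note that the conventional integer-ness test is val % 1 = 0, which never divides by zero:

```python
from decimal import Decimal

def is_integral(val):
    """Integer-ness test in the usual form: val % 1 = 0."""
    return val % 1 == 0

def mod_zero_fails(val):
    """Any modulus by zero errors out before a remainder can exist,
    which is the objection raised in the reply above."""
    try:
        val % Decimal(0)
        return False
    except ArithmeticError:  # DivisionByZero / InvalidOperation
        return True
```

The same holds in PostgreSQL: a `% 0` expression raises a division-by-zero error, while `% 1` yields the fractional part and so can be compared against zero.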
Re: [HACKERS] proposal: additional error fields
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Peter Geoghegan Sent: Tuesday, May 01, 2012 4:37 PM To: Tom Lane Cc: Pavel Stehule; PostgreSQL Hackers Subject: Re: [HACKERS] proposal: additional error fields On 1 May 2012 21:14, Tom Lane t...@sss.pgh.pa.us wrote: Peter Geoghegan pe...@2ndquadrant.com writes: Maybe no one is convinced by any of this, but the fact is that the SQLSTATE argument falls down when one considers that we aren't using it in many cases of errors that clearly are severe. The reason that argument isn't convincing is that we *are* using a SQLSTATE for every such message; it's just defaulted to XX000. AFAICT, it would be reasonable to treat all XX000 as alarm conditions until proven different. If a given message is, in fact, not supposed to be can't happen, then it shouldn't be going through elog(). We'd probably be needing to fix some places that were lazily coded as elogs, but under your proposal we would also have to touch every such place ... and thousands more besides. Fair enough. Adjusting all of those elog calls may be excessive. The argument could be made that what I've characterised as severe (which is, as I've said, not entirely clear-cut) could be deduced from SQLSTATE if we were to formalise the can't happen errors are only allowed to use elog() convention into a hard rule. However, I think it's critically important to make all of this easy and well-documented. Severity should probably be part of the default log_line_prefix. Sorry for high-jacking your thread, Pavel. So the apparent desire is to promote proper usage of SQLSTATE but simultaneously add and encode a default SQLSTATE_PG_SEVERITY value for each class/code that can be used for external monitoring and notification. 
Ideally customization could be done so that differing opinions on such severity classification could be made on a client-per-client basis without having to resort to outputting the SQLSTATE code itself and then requiring external software to maintain such an association. To that end any severity on the class itself would act as a default and specific codes that want to share the same severity can be skipped while those needing a different code can have an override specified. Since the codes are neither exhaustive nor mandatory such a default would apply to any user-chosen code not previously defined. Simply adding in more high-level categories avoids the issue that the current system has insufficient information encoded to facilitate desired reporting requirements. If we encode our messages with a sufficient level of detail then internally or externally adding categories and meta-data on top of those layers is simple and new ideas and techniques can be tried without having to modify the system in the future. Supplemental context information such as table and constraint names can be useful if the cost of recording such data is low enough and the value sufficient. That said, knowing the SQL that caused the error and which process it is implementing should be sufficient to identify possible causes and resolutions without requiring the specific columns and tables involved. Since constraint violations already expose the name of the violated constraint that particular situation seems to have a sufficient solution. Given that you should not be giving end-users that kind of implementation artifact anyway the developer and DBA should be able to identify the root cause and either avoid it themselves or code an application interface to present to the end-user. So, at least from my perspective, the bar to move this forward is pretty high - either it must be fairly simple to implement (which it is not) or there needs to be more value to it than I am seeing currently. 
This ignores whether normal runtime performance costs will be a significant factor. Looking at the SQLSTATE error classes I am somewhat concerned with the number of items found under HV and the apparent intermixing of client and internal error types. As for an upgrade path how about something along the lines of:
1) Make a best-attempt effort at identifying existing elog and ereport calls and modifying them to output specific SQLSTATE codes
2) Modify elog/ereport to catch and log (stack trace) any calls that do not set SQLSTATE to a specific value
3) During development, beta, and RC phases keep such code in place and ask people to look at their logs for missed elog/ereport calls
4) Remove the stack trace (logging) within ereport/elog from the final released code
David J.
Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers- ow...@postgresql.org] On Behalf Of Hannu Krosing Sent: Tuesday, May 01, 2012 5:29 PM The reason I am whining now is that with minor adjustments in implementation it could all be made much more powerful (try cast to ::json for values before cast to ::text) and much more elegant thanks to PostgreSQL's built in casting. If we allowed json to hold any JSON value and tried ::json when generating json for compound types then we would be able to claim that PostgreSQL supports JSON everywhere, defaulting to representing officially unsupported types as strings, but allowing users to convert these to their preferred conventions. I get that a JSON Text is always also a JSON Value but the reverse is not true. Thus, if we define JSON to be JSON Value we cannot guarantee that the encoded value is a possible JSON Text - the most important property for purposes of data interchange. I'd also prefer to have default conversions already included for some of our sexier types, like intervals (just a two element array) and hstore (an object) etc. Interval is not simply 2 values but also denotes whether the particular value is inclusive or exclusive; you would have to use an object unless you transmit in a text format and let the target perform the necessary interpretation of the string. Suddenly we would be the best match database for Web development and all things Ajax and also have a widely used built in and adjustable interchange format to the outer world. Second, RFC 4627 is absolutely clear: a valid JSON value can only be an object or an array, so this thing about converting arbitrary datum values to JSON is a fantasy. 
Probably a typo on your part - valid JSON _text_ is object or array, valid JSON value can also be number, text, true, false and null What I am arguing for is interpreting our json type as representing a JSON value not JSON text, this would enable users to adjust and extend the generation of json values via defining casts for their specific types - most notably Date* types but also things like hstore, which has a natural JSON representation as object (a list of key:value pairs for non-js users, a.k.a. a dictionary, hash, etc.) Aside from the fact it is likely too late to change the interpretation I would argue against doing so in any case. Currently, the idea is to get your result all lined up and ready to go and then ship it off to the caller as valid JSON so that the caller does not have to do so itself. Answering the question what would this value look like if it was part of a json output? is good; however, production use is likely to mostly care about the entire json interchange construct (i.e., JSON Text) So: json - json_text; A JSON Value always has a textual representation but if we were to have an actual type it would make sense to encode it such that (strings, objects and arrays) are delimited while (numbers, false, true, and null) are not. Type Name: json_value Output Representations (all output surrounded by double-quotes since all are string-like) - String: 'VALUE' (single-quote delimiter) Object: {...} Array: [] Number: 0.00 Other: false, true, null JSON is fundamentally an interchange format (especially from a database's perspective). JSON Values only really have meaning if they are attached explicitly to a JSON Text structure, if you wanted to store one independently you should convert it into a native representation first. The few exceptions to this would be sufficiently handled via plain text with meta-data indicating that the stored value is structured in directly JSON compatible syntax. 
In short, the default context for JSON in PostgreSQL should be JSON Text (not JSON Value) and thus the unadorned json should reflect this default (which it does).

If anything, we should adjust the JSON input routines to disallow anything else, rather than start to output what is not valid JSON. I tested python, ruby and javascript in firefox and chrome; all their JSON generators generate 1 for the standalone integer 1 and "a" for the standalone string a, and none refused to convert either to JSON.

Assume that we keep the meaning of json to be JSON Text; what would you suggest occurs if someone attempts a datum -> json cast? Given that we are working in a strongly-typed environment the meaning of JSON cannot be changed, and so either the cast has to output valid JSON Text or it has to fail. My personal take is to have it fail, since any arbitrary decision to cast to JSON Text is going to make someone unhappy, and supposedly they can modify their query so that the result generates whatever format they desire.

I haven't followed the JSON development in 9.2 too closely, but exposing whatever conversion mechanism is currently used to generate JSON makes sense from an ease-of-development standpoint. But even then, during
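For reference, the behavior of the stock JSON generators mentioned above is easy to reproduce; a quick Python sketch using the ordinary json module (nothing PostgreSQL-specific):

```python
import json

# Every value below is a valid JSON Value, but per RFC 4627 only the
# object and the array are valid JSON Texts (interchange documents).
print(json.dumps({"a": 1}))  # JSON Text: {"a": 1}
print(json.dumps([1, 2]))    # JSON Text: [1, 2]
print(json.dumps(1))         # JSON Value only: 1
print(json.dumps("a"))       # JSON Value only: "a"
print(json.dumps(None))      # JSON Value only: null
```

None of the generators tested above refuse the scalar cases; they simply emit a bare JSON Value.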
Re: [HACKERS] proposal: additional error fields
On May 1, 2012, at 20:05, Robert Haas robertmh...@gmail.com wrote:

On Tue, May 1, 2012 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: I continue to maintain that the SQLSTATE is a much better basis for solving this problem. Its categories are already pretty close to what Peter needs: basically, IIUC, he wants to know about classes 53, 58, maybe F0, and XX.

This is really too mushy, IMHO. ERRCODE_TOO_MANY_CONNECTIONS isn't what I'd call an oh-shit condition even though it's in class 53, but "could not create archive status file \"%s\"" is definitely an oh-shit regardless of what errcode_for_file_access() returns. Also, the fact is that most people do not log SQLSTATEs. And even if they did, they're not going to know to grep for 53|58|maybe F0|XX. What we need is an easy way for people to pick out any log entries that represent conditions that should never occur as a result of any legitimate user activity. Like, with grep. And, without needing to have a PhD in Postgresology.

If you want something really simple, why not output all elog calls to one file and ereport calls to the current log?

If you recognize the need to fix existing code so that you can determine the severity levels you desire, then go all the way and use SQLSTATE at the call level and then add meta-data about those codes higher up. That meta-data is then customizable, so those who want the too many connections error can see them while those that do not can turn them off. With the addition of the PostgreSQL-specific severity category, both that value and the SQLSTATE upon which it is based should be something that is considered best practice to output (and the default), and future attention should be given to ensuring that the code is as accurate as possible. Since existing log formats would still be valid, upgrades should not be an issue.

David J.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
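To illustrate the grep problem being discussed: if log_line_prefix were configured to include %e (the SQLSTATE), the filter Robert describes could be approximated as below. This is a sketch only - the class list comes from the discussion above, and the sample log lines are invented:

```python
import re

# "Oh-shit" SQLSTATE classes from the discussion: 53 (insufficient
# resources), 58 (system error), F0 (config file error), XX (internal).
SERIOUS = re.compile(r'\b(?:53|58|F0|XX)[0-9A-Z]{3}\b')

def serious_lines(log_lines):
    """Keep only log lines whose SQLSTATE falls in a serious class."""
    return [line for line in log_lines if SERIOUS.search(line)]

sample = [
    "53300 FATAL: sorry, too many clients already",
    "58P01 ERROR: could not open file",
    "42703 ERROR: column does not exist",
]
print(serious_lines(sample))  # keeps only the first two lines
```

Which also demonstrates David's point: ERRCODE_TOO_MANY_CONNECTIONS (53300) gets swept up by any class-based filter unless a separate layer of metadata says otherwise.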
Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?
On May 1, 2012, at 20:41, Hannu Krosing ha...@2ndquadrant.com wrote:

Most people don't work in a strongly-typed environment, and thus would work around such a restriction if they need a simple JSON value at the other end of the interchange. "My personal take is to have it fail since any arbitrary decision to cast to JSON Text is going to make someone unhappy and supposedly they can modify their query so that the result generates whatever format they desire." Do you actually have such an experience or is it just a wild guess?

So even given the semantic differences between an object and a scalar, I am better understanding where interpreting JSON as JSON Value makes sense. However, if I convert a record or array to JSON I expect to get a JSON Text even if there is only a single column or value in the input. I guess my take is that record -> JSON Text, while anything else is JSON Value. Whether maintaining the special case for record is worthwhile I really do not know, but the semantic difference does exist; and record output is a significant aspect of PostgreSQL output.

I get the ease-of-use aspect but also recognize that sometimes being slightly harder to use is worthwhile if you eliminate ambiguities or limit the possibility to make mistakes. FWIW my background on this topic is more theoretical than experiential, though I am a web-application developer by trade and do use some JSON in that capacity.

David J.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?
-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Hannu Krosing
Sent: Friday, May 04, 2012 4:40 PM
To: Robert Haas
Cc: Tom Lane; Andrew Dunstan; PostgreSQL-development; Merlin Moncure
Subject: Re: [HACKERS] JSON in 9.2 - Could we have just one to_json() function instead of two separate versions ?

On Fri, 2012-05-04 at 15:59 -0400, Robert Haas wrote:
On Fri, May 4, 2012 at 3:49 PM, Hannu Krosing ha...@krosing.net wrote:
On Fri, 2012-05-04 at 09:52 -0400, Tom Lane wrote:
Hannu Krosing ha...@2ndquadrant.com writes:
On Wed, 2012-05-02 at 12:06 -0700, Andrew Dunstan wrote:

So given that do we do anything about this now, or wait till 9.3? I'd like the json support in 9.2 updated as follows

I think it's too late to be entertaining proposals for such changes in 9.2. If we had concluded that the existing functions were actively wrong or a bad idea

I think that hard-coding the postgresql text representation as our json representation, without a possibility for the user to easily fix it without rewriting all xx_to_json() functions, is borderline actively wrong. Can we at least have the xxx_to_json() functions try a cast to json first and fall back to text if the cast fails?

I think the idea that you can involve the casting machinery in this is misguided. sometextval::json has got to mean that sometextval is expected to be in the form of a syntactically correct JSON value - and NOT that we wrap it in a JSON string. We can have constructors for JSON, but they've got to be separate from the casting machinery.

On the contrary - the string representation of the textual value a is "a". Casting should _not_ mean syntax check; casting is by definition a conversion. If we cast text to int, we return a value of type int; if we cast int to numeric(5,2) we return a value of type numeric(5,2). Why should casting to json work differently?

What is the distinction between what you are thinking regarding JSON and this example?
SELECT '1a'::integer;
SQL Error: ERROR: invalid input syntax for integer: "1a"
LINE 1: SELECT '1a'::integer

As a user, if I cast something to something else I want the result to be of the correct type and deterministic; otherwise throw me some kind of invalid input format exception (or syntax exception). Casting vs. constructors is really a meaningless distinction to a lay person. When I cast, I do so by constructing a new value using my existing value for input. When I use an explicit CAST I am unable to supply additional parameters to configure the casting, whereas a constructor function gives me that possibility. But a constructor function without any additional parameters is not semantically different than a cast.

I guess the concern to address is something like: SELECT '{"key": "value"}'::json OR SELECT '[1.25]'::json; Do you interpret these as already being valid JSON and thus output object/array constructs (i.e., JSON Text), or do you treat them as string literals and output scalars (i.e., JSON Value)? Even if you feel these are artificial constructs, the concept holds that there may be ambiguous data that can be interpreted in multiple ways (this applies even to function forms, though in the function form you could specify which one you want to output using a separate DEFAULTed parameter).

I can see the primary use-case for JSON Value casting as being queries of the following forms (since the record and array forms are going through the record/array_to_json functions):

SELECT COUNT(*)::json FROM table [WHERE ...];
SELECT single_col::json FROM table WHERE id = ?;

Where the single provided value can be sent directly back to the web-caller JavaScript and used as-is because it is valid JSON. Though, honestly, both SELECT to_json(single_col) and SELECT to_json(COUNT(*)) are equally usable, so any distinction between them is a pure technical issue to me.
Am I correct in assuming the following expected behavior (and forgive the blatantly wrong syntax, but you should get the point)?

RAISE NOTICE '%', SELECT 'A'::text = A
RAISE NOTICE '%', SELECT 'A'::json = "A"

David J

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
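The casting ambiguity discussed above - is a string literal already JSON to be parsed, or a plain string to be wrapped? - is the same choice any JSON library forces on you. A Python sketch of the two interpretations (illustrative only, not PostgreSQL semantics):

```python
import json

s = '{"key": "value"}'

# Interpretation 1: the literal is already JSON Text -> parse it,
# yielding an object construct.
parsed = json.loads(s)

# Interpretation 2: the literal is a plain string -> wrap it,
# yielding a JSON string scalar (a bare JSON Value).
wrapped = json.dumps(s)

print(parsed)   # a dict: {'key': 'value'}
print(wrapped)  # a quoted, escaped string
```

Both interpretations are internally consistent; the point of contention in the thread is which one an unadorned ::json cast should pick.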
Re: [HACKERS] Possible error in psql or Postgres?
On Jun 12, 2012, at 15:21, Dusan Misic promi...@gmail.com wrote:

Is this normal Postgres / psql behavior?

griffindb=# \d system.user;
              Table "system.user"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------------------------------------------------
 username  | character varying(20) | not null
 password  | character varying(32) | not null
 firstname | character varying(40) | not null default 'nema ime'::character varying
 lastname  | character varying(40) | not null default 'nema prezime'::character varying
Indexes:
    "SystemUser_PK" PRIMARY KEY, btree (username) CLUSTER

normal query:

griffindb=# select * from system.user where username = 'root';
 username |             password             | firstname |   lastname
----------+----------------------------------+-----------+---------------
 root     | 1e7db545fccbf4e03abc6b71d329ab4f | Super     | administrator
(1 row)

error query:

griffindb=# select * from system.user where user = 'root';
 username | password | firstname | lastname
----------+----------+-----------+----------
(0 rows)

"column user does not exist" should throw an error! PostgreSQL / psql version: 9.1.3 on Windows 7 64-bit. Should Postgres or psql report an error because the column used in the WHERE clause does not exist?

http://www.postgresql.org/docs/9.0/interactive/functions-info.html

user is actually a function that returns the current_user. It is an SQL special function and thus does not require the use of () after the function name. So basically you are saying where current_user = 'root', which is either constant true or constant false for the statement.

David J.
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
On Jun 27, 2012, at 22:00, Josh Berkus j...@agliodbs.com wrote:

Folks, Yeah, I can't believe I'm calling for *yet another* configuration variable either. Suggested workaround fixes very welcome. The basic issue is that autovacuum_max_workers is set by most users based on autovac's fairly lightweight action most of the time: analyze, vacuuming pages not on the visibility list, etc. However, when XID wraparound kicks in, then autovac starts reading entire tables from disk ... and those tables may be very large. This becomes a downtime issue if you've set autovacuum_max_workers to, say, 5 and several large tables hit the wraparound threshold at the same time (as they tend to do if you're using the default settings). Then you have 5 autovacuum processes concurrently doing heavy IO and getting in each others' way. I've seen this at two sites now, and my conclusion is that a single autovacuum_max_workers isn't sufficient to cover the case of wraparound vacuum. Nor can we just single-thread the wraparound vacuum (i.e. just one worker) since that would hurt users who have thousands of small tables.

Would there be enough benefit to setting up separate small/medium?/large thresholds, with user-changeable default table-size boundaries, so that you can configure 6 workers where 3 handle the small tables, 2 handle the medium tables, and 1 handles the large tables? Or, alternatively, a small worker consumes 1, a medium 2, and a large 3 'units' from whatever size pool has been defined. So you could have 6 small tables or two large tables in-progress simultaneously.

David J.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
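The 'units' alternative in the last paragraph amounts to a simple counting pool. A minimal sketch - the class names, costs, and pool size are hypothetical, taken only from the example numbers above:

```python
# Hypothetical unit cost per table-size class: small = 1, medium = 2,
# large = 3, all drawn from one shared pool (6 units in the example).
COST = {"small": 1, "medium": 2, "large": 3}

class AutovacUnitPool:
    def __init__(self, total_units=6):
        self.available = total_units

    def try_start(self, size_class):
        """Claim units for a worker; False means the caller must wait."""
        cost = COST[size_class]
        if cost > self.available:
            return False
        self.available -= cost
        return True

    def finish(self, size_class):
        """Return a finished worker's units to the pool."""
        self.available += COST[size_class]

pool = AutovacUnitPool()
print(pool.try_start("large"), pool.try_start("large"))  # True True
print(pool.try_start("small"))                           # False: pool drained
```

Two concurrent wraparound vacuums of large tables saturate the pool, while six small-table workers could run side by side - exactly the trade-off described above.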
Re: [HACKERS] [GENERAL] Issues with generate_series using integer boundaries
On Wed, Feb 9, 2011 at 4:50 AM, Thom Brown t...@linux.com wrote:
On 9 February 2011 02:11, Robert Haas robertmh...@gmail.com wrote:
On Tue, Feb 8, 2011 at 8:30 PM, Andrew Dunstan and...@dunslane.net wrote:

Quite right, but the commitfest manager isn't meant to be a substitute for one. Bug fixes aren't subject to the same restrictions as feature changes.

Another option would be to add this here: http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items

I've removed it from the commitfest because it really doesn't belong there, and I've added it to the open items list.

So, I finally got around to look at this, and I think there is a simpler solution. When an overflow occurs while calculating the next value, that just means that the value we're about to return is the last one that should be generated. So we just need to frob the context state so that the next call will decide we're done. There are any number of ways to do that; I just picked what looked like the easiest one.

Tangential comment, but have you considered emitting a warning (and/or log entry) when you are 10,000-50,000 away from issuing the last available number in the sequence, so that some recognition exists that any code depending on the sequence is going to fail soon? Also, during sequence creation you know the integer type being used, so the maximum value is known and an overflow should not need to come into play (I guess the trade-off is the implicit try-catch [or whatever mechanism C uses] performance hit versus the need to store another full integer in the data structure). You could also give access to the warning threshold value so that the developer can change it to whatever value is desired (with a meaningful default of course).

David J.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
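The overflow-stop fix described above boils down to: emit the current value, then mark the generator finished if computing the next value would overflow. A Python model of that control flow (the real fix is C inside generate_series's per-call context; the names here are illustrative):

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

def generate_series_int32(start, stop, step=1):
    current = start
    while (current <= stop) if step > 0 else (current >= stop):
        yield current
        # If current + step would overflow int32, the value just
        # emitted was the last one: stop instead of wrapping around.
        if step > 0 and current > INT32_MAX - step:
            return
        if step < 0 and current < INT32_MIN - step:
            return
        current += step

print(list(generate_series_int32(INT32_MAX - 2, INT32_MAX)))
# [2147483645, 2147483646, 2147483647] - terminates cleanly at the boundary
```

Note the overflow test is done by comparing against the boundary before adding, which is also how C code avoids the undefined behavior of signed overflow.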
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
Christopher Browne wrote: Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1, there's a pretty good reason NOT to support that, namely that this breaks relational handling of tables. PostgreSQL is a *relational* database system, hence it's preferable for structures to be relational, as opposed to hierarchical, which is what any of the suggested nestings are.

Organizing the tables into a multi-level namespace, either fixed-depth or variable-depth, rather than using a flat namespace, does not make the database any less relational, because the above definition and any others still hold. The "less relational" argument above is a red herring or distraction. One can argue against namespace nesting just fine without saying that. -- Darren Duncan

I agree with Darren. One thought that came to my mind was to use a different separator between two namespaces and/or between the database identifier and the rest of the path. Examples: ns1!ns2.table OR database@ns1.table OR database@ns1!ns2.table

I've been following only some of the discussion, but it seems that much ambiguity would be lost by using different separators. Schemas themselves are already non-standard, so it isn't like we are constrained here in what is chosen. Just some quick thoughts I've had but haven't fully considered how they would fit into the existing setup. But is there any major reason why choosing different separators would not work?

Also, within search_path, some form of wild-card selector would be desirable: ns1!*. I'm not opposed to having to be explicit about the search_path in order to avoid name collisions; though it would be nice if VIEWs had some kind of SET syntax, like functions do, so that the definer can specify the search_path that the view will resolve against.

David J.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug f...@phlo.org wrote:
On Jul 11, 2011, at 07:08, Darren Duncan wrote:
Christopher Browne wrote:

Vis-a-vis the attempt to do nested naming, that is ns1.ns2.table1, there's a pretty good reason NOT to support that, namely that this breaks relational handling of tables. PostgreSQL is a *relational* database system, hence it's preferable for structures to be relational, as opposed to hierarchical, which is what any of the suggested nestings are.

Rather, the argument is that it was intentional for the structuring of table naming to, itself, be relational, and changing that definitely has some undesirable characteristics. The need for recursive queries is the most obvious undesirable, but it's not the only undesirable thing, by any means.

I do not see how recursive queries (really iteration of records) even enter the picture... Right now I can emulate a hierarchical schema structure via a naming scheme - for example schemabase_sub1_sub2_etc. I am simply looking for a formal way to do the above AND also tell the system that I want all schemas under schemabase to be in the search path. Heck, I guess just allowing for simple pattern matching in search_path would be useful in this case regardless of the presence of an actual schema hierarchy. Using LIKE syntax, say: SET search_path TO schemabase_sub1_% or something similar. The only missing ability becomes a way for graphical tools to represent the schema hierarchy using a tree structure with multiple depths. I can see how adding . and .. and relative paths would confuse the issue; those are not necessary features of a multi-level schema depth. The above, combined with a different separator for intra-level namespace/schema delineation, would allow for an unambiguous way to define and use a hierarchical schema with seemingly minimal invasion into the current way of doing things.
You could almost implement it just by requiring a specific character to act as the separator and then constructing the actual schema using single-level literals and supporting functions that can convert them into a hierarchy. In other words, the schema table would still only contain one field with the full parent!child, as opposed to (schema, parent) with (VALUES('parent',null),('child','parent')).

In other words, if we use ! as the separator, any schema named parent!child could be stored and referenced as such, but then if you run a getChildren(parent) function it would return child along with any other schemas of the form parent!%. In this case the % sign could maybe only match everything except ! and the * symbol could be used to match ! as well. I could give more examples but I hope the basic idea is obvious.

The main thing is that the namespace hierarchy usage is standardized in such a way that pgAdmin and other GUI tools can reliably use it for display purposes, and that search_path can be constructed in a more compact format so that every schema and sub-schema is still absolutely referenced (you can even have the SET command resolve search_path at execution time and then remain static, just like CREATE VIEW ... AS SELECT * FROM table).

David J.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
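A sketch of the wildcard convention proposed above - '%' stops at the '!' separator while '*' crosses it. Everything here (the '!' separator, getChildren, the matching rules) is the hypothetical convention from the text, not an existing PostgreSQL feature:

```python
import re

def schema_pattern_to_regex(pattern):
    # '%' matches any run of characters except the '!' separator;
    # '*' matches across separators as well; everything else is literal.
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append("[^!]*")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")

def get_children(parent, schemas):
    # Direct children only: parent!<one more segment, no further '!'>
    rx = schema_pattern_to_regex(parent + "!%")
    return [s for s in schemas if rx.match(s)]

schemas = ["parent!child", "parent!child!grandchild", "other!thing"]
print(get_children("parent", schemas))  # ['parent!child']
deep = schema_pattern_to_regex("parent!*")
print([s for s in schemas if deep.match(s)])  # both 'parent!...' entries
```

The same matcher could back a pattern-bearing search_path: each pattern entry expands, at SET time, to the flat list of schemas it matches.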
Re: [HACKERS] Unsigned integer types
Maciej Gajewski wrote: I'm also afraid that with the extension I'd be left on my own maintaining it forever. While if this could go into the core product, it would live forever.

Clarification from the gallery: are we talking an extension or a custom PostgreSQL build/fork? If it is an extension, then stick it up on GitHub and let whoever finds it valuable help contribute to keeping it relevant. No use letting perfection stand in the way of usability. If the current solutions are too slow then exploring the extension aspect - even if it falls short - is worthwhile. At minimum you learn from the experience, and maybe someone else (or even yourself) can build on that foundation.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Unsigned-integer-types-tp5756994p5757234.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bad error message on valuntil
Tom Lane-2 wrote: Joshua D. Drake <jd@...> writes:

I had a customer pulling their hair out today because they couldn't login to their system. The error was consistently: 2013-06-07 08:42:44 MST postgres 10.1.11.67 27440 FATAL: password authentication failed for user "user". However the problem had nothing to do with password authentication. It was because the valuntil on the user had been set till a date in the past. Now technically if we just removed the word password from the error it would be accurate, but it seems it would be better to say: FATAL: the user "user" has expired.

I think it's intentional that we don't tell the *client* that level of detail. I could see emitting a log message about it, but it's not clear whether that will help an unsophisticated user. regards, tom lane

I presume that password in this context refers to the method by which identity is checked, some alternatives being trust and ident? By the same logic of why you would not expose the fact that the user is expired versus the user having provided invalid credentials, exposing password is a security leak as well. And then, to top it off, it provides a red herring to the user trying to figure out why their username/password combination isn't working. Something like: 'Authentication for user "user" failed. Update and try again or contact the administrator to confirm "user" is authorized to log onto the system.'

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Bad-error-message-on-valuntil-tp5758369p5758383.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bad error message on valuntil
Joshua D. Drake wrote: On 06/07/2013 12:31 PM, Tom Lane wrote: Joshua D. Drake <jd@...> writes: On 06/07/2013 11:57 AM, Tom Lane wrote:

I think it's intentional that we don't tell the *client* that level of detail.

Why? That seems rather silly.

The general policy on authentication failure reports is that we don't tell the client anything it doesn't know already about what the auth method is. We can log additional info into the postmaster log if it seems useful to do so, but the more you tell a client, the more you risk undesirable info leakage to a bad guy. As an example here, reporting the valuntil condition would be acking to an attacker that he had the right password.

So security by obscurity? Alright, without getting into that argument, how about we change the error message to: FATAL: Authentication failed: Check server log for specifics. And then we make sure we log proper info? Sincerely, Joshua D. Drake

regards, tom lane

In a password login situation you should not indicate to the client why the login attempt failed. If you say that the password expired, they know the username supplied has to be correct (otherwise how would you know the password is expired?). However, echoing back the supplied user identifier (without otherwise implying that it exists or does not exist on the server) provides a quick verification spot for the user to see whether the expected user name was being sent - especially since the location of the error message is probably significantly removed from the location of the user name string on the client. "Please check server log for specifics" is not a good message for something sent to a client that in many normal situations would have no access to said logs. I'd suggest: Authentication Failed: the user (role_name) password combination was not found or is expired. How a particular user is to go about resolving the issue is an organizational (and individual) policy best ignored in the error message.
For a stressed-out, administrator-capable user who sees this message, they at least are reminded that even if the combination exists it is possible that it has somehow been disabled. Hopefully they will then remember that password expiration is possible and will check that along with the presence of the role/user.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Bad-error-message-on-valuntil-tp5758369p5758398.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why can't I use windowing functions over ordered aggregates?
Cédric Villemain-2 wrote: And also, first_value is a *window* function, not a simple aggregate function...

Per the documentation, any aggregate function can be used with a WINDOW declaration. The logical question is why are window aggregates special so that the reverse cannot be true? In other words, why isn't every function simply defined as a normal aggregate that can be used in both contexts?

See this example:

# create table foo (i int, t timestamptz);
# insert into foo select n, now() from generate_series(1,10) g(n);
# select i, first_value(i) over (order by t desc) from foo;
# select i, first_value(i) over (order by t desc ROWS between 0 PRECEDING and UNBOUNDED FOLLOWING) from foo;

What do you expect SELECT first(val order by ts desc) to output?

Undefined, due to incorrect specificity of the ORDER BY definition. The window version has the same issue. The window aggregates should simply treat the entire input set as the relevant frame - basically the same output as would result from (simplistically):

SELECT window_agg(...) FROM (
  SELECT id, window_agg(...) OVER (ORDER BY id ASC)
  ORDER BY id ASC
) agg
ORDER BY id DESC LIMIT 1

Admittedly this really only makes sense for first_value, last_value, and nth_value; the other window aggregates can return valid values, but to have meaning they really need to be output in a windowing context.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Why-can-t-I-use-windowing-functions-over-ordered-aggregates-tp5760233p5760358.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Kudos for Reviewers -- straw poll
Brendan Jurd wrote: On 26 June 2013 03:17, Josh Berkus <josh@...> wrote:

How should reviewers get credited in the release notes? a) not at all b) in a single block titled "Reviewers for this version" at the bottom c) on the patch they reviewed, for each patch

I think some consideration toward a commit and review summary (outside the release notes; graphical/interactive in nature, ideally) for each major release is something worth considering. With regards to the release notes I'd lean toward (b); significant contributions getting bumped to co-author on specific patches covers (c) fairly well. I am unsure whether release-note mentions are significant enough motivation... see other thoughts below.

Should there be a criteria for a creditable review? a) no, all reviews are worthwhile b) yes, they have to do more than "it compiles" c) yes, only code reviews should count

Ideally (a), though (b) conceptually makes sense but is too generic.

Should reviewers for 9.4 get a prize, such as a t-shirt, as a promotion to increase the number of non-submitter reviewers? a) yes b) no c) yes, but submitters and committers should get it too

One low-cost prize that I've pondered is, on an ongoing basis, the ability to post a link and/or message to the PostgreSQL front page with a significantly less stringent barrier to acceptance than is required for current content. Basically, except for topics or presentations deemed of poor taste or detrimental to the project, anything should be allowed. Some kind of "this message was allowed because so-and-so has recently made the following significant contributions to the project."
There are probably quite a few logistics to deal with down this path, but a sponsored platform for shameless self-promotion for people making the project successful - something visible on an ongoing basis and not just once a year in a release note - is likely very valuable to the contributor while fairly inexpensive to the project (i.e., some risk of reputation and some cost to set up the infrastructure).

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Kudos-for-Reviewers-straw-poll-tp5760952p5761031.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] column b is of type X but expression is of type text
Benedikt Grundmann wrote A third party application we use generates SQL queries. Here is query it generated that broke today and for which I have a hard time arguing that the postgres behavior is correct (minimally the error message is confusing): =# create temporary table foo (b double precision ); CREATE TABLE Time: 40.368 ms =# insert into foo select min(NULL); ERROR: column b is of type double precision but expression is of type text LINE 1: insert into foo select min(NULL); ^ HINT: You will need to rewrite or cast the expression. So why does min(NULL) have type text? According to the docs it has the type of the input. The value is itself NULL which is a valid member of all types in SQL isn't it? So what is going on? Thanks, Bene Ideally PostgreSQL would be smart enough to recognize that min(NULL) is of an unknown type and thus would use the definition of foo to coerce NULL to the desired type. I cannot explain why it does not do this but from the example it cannot. Using a literal NULL without an explicit type-cast is not recommended as the system cannot always accurately figure out what type you mean for it to use. Being a valid value for all types does not mean it magically switches to fit whatever usage is required. Columns are typed, not values per-se, and so NULL can belong in any column but once it is part of that column it takes on that column's type. The query you show is pretty pointless since the intent of min is to take a column over which to aggregate; not a literal which will only ever return itself. In short the SELECT query is trying its best to execute and so in the presence of an unadorned NULL - and being unable to infer the type from context - it simply uses the default type which is text. The SELECT executes just fine, and outputs a min column of type text which when supplied to the table foo causes the type mis-match for column b on foo. 
The PostgreSQL behavior is simple: it does not infer the type of NULL from the column in foo, but since it is not required to do so, its failure is not wrong. The error message, given what does occur, makes perfect sense and is easy enough to trace (i.e., what column is feeding foo.b from the SELECT statement; then, why is that column being seen as text). PostgreSQL is, in the opinion of some, too verbose in its requirement to be explicit regarding types, but it does make for less buggy code overall. This particular use-case may be solvable, but I'd argue that your example is not likely to convince anyone that it is a serious enough problem to be worth the effort it would take to do so.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763587.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] column b is of type X but expression is of type text
Josh Berkus wrote:

> On 07/12/2013 07:28 AM, Benedikt Grundmann wrote:
>> Thanks David, I like the fact that postgres is explicit in its types. All I'm arguing is that the error message is misleading, and that I had a hard time understanding why what happened happened. The part I was missing is that despite supporting an "any" type the necessary type inference is very, very local and quickly resorts to the default type.
>
> 2. you call min(). Min() works for many datatypes. Min() says: "can I work for text?" The answer is yes, so at this point the NULL which was default text becomes *really* text. . . . The alternative would be to disallow unadorned NULLs entirely, which would break thousands of applications.

In the absence of the function call the system is able to delay resolving the type until later in the query:

SELECT * FROM (VALUES ('2013-02-01'::date), ('2013-01-01'), (NULL)) vals (col1); -- works
SELECT * FROM (VALUES ('2013-02-01'::date), ('2013-01-01'), (min(NULL))) vals (col1); -- fails

I have no idea how this mechanism works, but ISTM that the planner could, for anyelement, look at where the result of the function call is used and add a cast to the function input value to match the desired result type if the input type is undefined.

I'm curious what you would consider to be a more apropos error message in this situation, regardless of how difficult it would be to implement. I am also curious whether you can think of a better example of where this behavior is problematic. The query for this thread is not something that I would deem to be good SQL.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763615.html
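The "delayed resolution" behavior above also points at the practical workaround: give the otherwise-unknown value an explicit type so the text default never kicks in. A brief sketch, reusing the date values from the examples above:

```sql
-- The bare NULL is resolved from the other rows in the VALUES list:
SELECT * FROM (VALUES ('2013-02-01'::date), ('2013-01-01'), (NULL)) vals (col1);

-- Once a function forces early resolution, an explicit cast keeps the
-- intended type in play:
SELECT min(NULL::date);  -- a NULL of type date, not text
```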
Re: [HACKERS] A general Q about index
soroosh sardari wrote:

> Hi. I want to know how an index is created and used. Actually, if you can show me a simple starting point, it would be great. Regards, Soroosh Sardari

In the documentation there is a table of contents, and listed on that table is a section named "Indexes".

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/A-general-Q-about-index-tp5763912p5763926.html
Re: [HACKERS] A general Q about index
David Johnston wrote:

> In the documentation there is a table of contents and listed on that table is a section named Indexes.

Since you posted this to -hackers you may mean that you wish to know how to program them, as opposed to using them in SQL. I have no clue in that regard. If you did mean use in SQL, then the documentation is your friend, and you also should not have posted this question to -hackers but to -general instead; you probably should have posted there anyway to begin with, and been more clear as to what you mean by "created and used".

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/A-general-Q-about-index-tp5763912p5763928.html
[HACKERS] Re: How to configer the pg_hba record which the database name with \n ?
huxm wrote:

> ... where there is a newline (\n) in the name.

I can't imagine why you would want to use non-printing characters in a name, especially a database name. Even if the pg_hba.conf file were able to interpret it (which it probably cannot, but I do not know for certain), client interfaces are likely to have problems as well. Most of these would not think of interpolating a database identifier in that manner but would instead treat the name as a literal value. Even when line-continuations are allowed they are often cosmetic in nature, and the resultant newline is discarded during the pre-execution phase of the command interpreter.

Arguably, having a check constraint on the catalog to prohibit such a name would be more useful than trying to make such a construct functional. I'd guess that in the immediate term the users accessing this database would need to have "all" as their target, and then you use role-based authorization to limit which specific databases are accessible.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/How-to-configer-the-pg-hba-record-which-the-database-name-with-n-tp5765847p5765889.html
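The "all plus role-based authorization" workaround sketched above would look roughly like this (the role name and address are illustrative assumptions). The pg_hba.conf line stops matching on the unquotable database name, and the GRANT/REVOKE layer handles the per-database restriction instead:

```
# pg_hba.conf: match every database for this role, since the troublesome
# database name cannot be written here.
# TYPE  DATABASE  USER      ADDRESS       METHOD
host    all       app_user  10.0.0.0/24   md5
```

Then, in SQL, `REVOKE CONNECT ON DATABASE ... FROM PUBLIC;` plus `GRANT CONNECT ON DATABASE ... TO app_user;` limits which databases the role can actually reach.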
[HACKERS] Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Minor request: could someone enlighten me as to why making the directory location a compile-time option is undesirable? Packagers can then set up whatever structure they desire when they compile their distributions. In that case the discussion becomes what a reasonable default is, and that can be decided with respect to the other defaults that are in place for people who self-compile.

I get the "supporting users - telling them where to go to find these files" aspect, but I believe that ship has already sailed. The goal should be to make it as easy as possible for distributions and/or individual users to integrate PostgreSQL into their normal routine. It isn't as if we are adding unneeded complexity, since it is obvious from the discussion that where files/directories are placed in the file system is a major variable. Enforcing $PGDATA when we know Debian is going to be upset doesn't seem to be that great an idea - it isn't as if we are going to suddenly make them realize they have been doing things incorrectly all this time. I am not familiar with all of the configurations, but I do recall that the location of postgresql.conf and related files is already distribution-specific, so why shouldn't these extensions be as well?

Sorry if this was discussed previously; I'll go look deeper in the thread if someone confirms that indeed it was.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Proposal-for-Allow-postgresql-conf-values-to-be-changed-via-SQL-tp5729917p5765892.html
[HACKERS] Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Andres Freund-3 wrote:

> Even trying to do this completely will guarantee that this patch will never, ever, succeed. There simply is no way to reliably detect problems that have complex interactions with the rest of the system. We can improve the detection rate of problems after some real-world experience. Don't make this unnecessarily complex.

Instead of prevention, some thought to recovery should be considered then. How about some form of persistence mechanism so that, before making these kinds of changes, the admin can save the current configuration? Then, in a worst-case scenario, they could run something like "pg_ctl --restore-persisted-configuration ..." to reset everything back to the last known good configuration. A single-version save-restore routine for the configuration.

When restoring, you would want to keep the current/non-working configuration and associated logging information - maybe archived somewhere along with a copy of the last known working version. This would provide some level of audit capability, as well as a convenient way for someone to take that archive and send it off to someone more knowledgeable for assistance.

Having it auto-run at boot time - possibly to a different archive area than when run manually - would be possible as well; so you'd have both the last good boot configuration as well as whatever point-in-time configurations you wish to save.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Proposal-for-Allow-postgresql-conf-values-to-be-changed-via-SQL-tp5729917p5765968.html
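To be clear, the pg_ctl flag named above is hypothetical, not an existing feature. As a rough sketch of the save/restore mechanics being proposed (all paths and file contents here are illustrative assumptions), the same effect can be approximated today with plain file copies:

```shell
# Illustrative sketch only: archive a known-good postgresql.conf before a
# change, keep the broken copy for auditing, then restore the archive.
PGDATA="${TMPDIR:-/tmp}/demo_pgdata"
ARCHIVE="$PGDATA/config_archive"
mkdir -p "$PGDATA" "$ARCHIVE"

# Simulate an existing, working configuration file.
printf 'shared_buffers = 128MB\n' > "$PGDATA/postgresql.conf"

# "Save": copy the known-good configuration aside with a timestamp.
stamp=$(date +%Y%m%d%H%M%S)
cp "$PGDATA/postgresql.conf" "$ARCHIVE/good.$stamp.conf"

# A bad edit happens (the kind that validation might miss).
printf 'shared_buffers = not_a_size\n' > "$PGDATA/postgresql.conf"

# "Restore": archive the broken copy for later analysis, then bring back
# the most recent known-good version.
cp "$PGDATA/postgresql.conf" "$ARCHIVE/broken.$stamp.conf"
latest=$(ls "$ARCHIVE"/good.*.conf | sort | tail -n 1)
cp "$latest" "$PGDATA/postgresql.conf"
```

Keeping the broken copy alongside the restored one gives exactly the audit trail described above: the archive directory can be tarred up and sent to someone more knowledgeable.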
[HACKERS] Re: File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Josh Berkus wrote:

> (this discussion concerns issue (D), file-per-setting vs. one-big-file) So the case of two sessions both modifying ALTER SYSTEM SET, and one succeeding for some-but-not-all-GUCs, and the other succeeding for some-but-not-all-GUCs, would not be user-friendly or pretty, even if each setting change succeeded or failed atomically.

Could the final file write occur only at COMMIT, with anything inside a transaction simply staged up for later saving (or rollback)? The file-write phase as a whole then needs to be atomic, not just each single GUC file. Could the system read the last-update timestamp of each GUC file when the original statement is executed, then re-read all of them at commit and fail with some kind of serialization error if the last-update timestamp on any of the files has changed?

I dislike the idea of any kind of automatic reload. That said, some kind of "have there been any configuration changes since the last reload?" query/function makes sense. It can be plugged into Nagios or similar to warn if these changes are occurring but have not been made live.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Proposal-for-Allow-postgresql-conf-values-to-be-changed-via-SQL-tp5729917p5766338.html
[HACKERS] Re: Doc Patch: Subquery section to say that subqueries can't modify data
Instead of simply expanding the section on sub-queries, which may still be worthwhile, it seems that we have effectively introduced a new kind of query - namely one that mixes both data retrieval and data modification into a kind of hybrid query. An entire section describing the means to implement these queries, and the limitations thereof, would seem advisable, as the current material is spread throughout the documentation. Some areas to address would be SELECT queries that cause/utilize:

- function-based modifications
- CTE-based modifications
- FDW/dblink-based modifications

I guess the main question is: if someone were to put this together, would it likely be included in the "Queries" section of the documentation? Also, are there any other thoughts to add; and is something like this documented in a ToDo somewhere already?

The proposed patch, while warranting a technical review (namely that the presence of functions in a sub-select can cause the sub-query to update the database), seems to add one more place to go find this information without adding a central index or summary that someone learning the system could directly comprehend, as opposed to it being some allowed/disallowed side-effect of something else.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Doc-Patch-Subquery-section-to-say-that-subqueries-can-t-modify-data-tp5766574p5766580.html
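Of the three categories above, the CTE-based form is perhaps the clearest illustration of such a hybrid query: a single statement that both modifies data and is queried from (data-modifying CTEs exist as of PostgreSQL 9.1; the table names here are illustrative):

```sql
-- Move completed rows from one table to another in a single statement:
-- the DELETE runs inside the WITH clause and its RETURNING rows feed
-- the outer INSERT.
WITH moved AS (
    DELETE FROM pending_tasks
    WHERE done
    RETURNING id, payload
)
INSERT INTO archived_tasks (id, payload)
SELECT id, payload FROM moved;
```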
Re: [HACKERS] Variadic aggregates vs. project policy
Tom Lane-2 wrote:

> Pavel Stehule <pavel.stehule@> writes:
>> I was one who sent a bug report - this error is not too dangerous, but it is hidden and difficult to find if you don't know what can happen. Same as the bug with plpgsql and SQL identifier collisions. If you understand, then you can protect yourself well and simply. If not, then it is a magic error. So I still think the best solution is: a) a warning when we detect ORDER BY in variadic aggregates
>
> Such a warning would never be tolerated by users, because it would appear even when the query is perfectly correct.
>
>> b) disallow ORDER BY in variadic aggregates in classic syntax, and enable it only in WITHIN GROUP syntax where it is safe
>
> And we're *not* inventing randomly different syntax for variadic aggregates. That ship sailed when we did it this way for regular functions.

In the example case the problem is that "ORDER BY constant" is a valid, if not-very-useful, construct. Can we warn on this specific usage and thus mitigate many of the potential avenues of mis-use?

If we alter syntax for mitigation purposes I'd want to consider requiring parentheses around the columns that belong to the ORDER BY instead of using the full extended syntax of WITHIN GROUP.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Variadic-aggregates-vs-project-policy-tp5768980p5769106.html
Re: [HACKERS] Variadic aggregates vs. project policy
Andres Freund-3 wrote:

> On 2013-08-30 06:34:47 -0700, David Johnston wrote:
>> In the example case the problem is that "ORDER BY constant" is a valid, if not-very-useful, construct. Can we warn on this specific usage and thus mitigate many of the potential avenues of mis-use?
>
> That doesn't help against something like »SELECT string_agg(somecol ORDER BY bar, separator)« where separator is a column.
>
>> If we alter syntax for mitigation purposes I'd want to consider requiring parentheses around the columns that belong to the ORDER BY instead of using the full extended syntax of WITHIN GROUP.
>
> I think that ship has sailed. The syntax is there and it's not going away. Requiring different syntaxes for variadic/nonvariadic usages is going to be a way much bigger pitfall for users.

Neither suggestion (nor any suggestion I would imagine) is going to solve the problem. The goal is to minimize the size of the exposure. The second, "ORDER BY (col1, col2)", suggestion would be added and recommended, so those using that syntax would have less to worry about. This would apply to ALL invocations, not just variadic ones.

David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Variadic-aggregates-vs-project-policy-tp5768980p5769119.html
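Andres's »string_agg« case above is the classic form of this pitfall, and it is easy to reproduce (the table and column names here are illustrative):

```sql
-- Intended: aggregate names sorted, separated by ', '.
SELECT string_agg(name, ', ' ORDER BY name) FROM people;

-- Pitfall: the separator lands after ORDER BY, so ', ' is parsed as a
-- second sort key and string_agg is called with only one argument.
-- For two-argument string_agg this at least raises an error, but for a
-- variadic aggregate the equivalent mistake can silently change which
-- values are arguments and which are sort keys:
-- SELECT string_agg(name ORDER BY name, ', ') FROM people;
```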
Re: [HACKERS] ENABLE/DISABLE CONSTRAINT NAME
Jeff Davis-8 wrote:

> Is there any semantic difference between marking a constraint as DISABLED and simply dropping it? Or does it just make it easier to re-add it later?

I cannot answer the question, but if there is none then the main concern I'd have is capturing meta-information about WHY such a constraint has been disabled instead of dropped.

I guess this whole feature extends from the trigger-disable feature that already exists. Given we have the one, adding this seems symmetrical... I cannot really see using either feature on a production system (if following best practices), but I can imagine how they could both be helpful during development. Note that with this usage pattern the meta-information about "why" becomes considerably less important.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/ENABLE-DISABLE-CONSTRAINT-NAME-tp5769136p5769337.html
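For reference, the existing trigger-disable feature mentioned above looks like this (the table and trigger names are illustrative); presumably the constraint proposal would mirror this shape:

```sql
-- Temporarily suspend a trigger during development or bulk loading:
ALTER TABLE orders DISABLE TRIGGER audit_orders;

-- ... bulk operations here ...

ALTER TABLE orders ENABLE TRIGGER audit_orders;
```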
Re: [HACKERS] 9.3 RC1 psql encoding reporting inconsistently?
Tom Lane-2 wrote:

> Michael Nolan <htfoot@> writes:
>> This is 9.3 RC1 on a Fedora 7 system. Why does \l report the encoding as SQL_ASCII and \set report it as UTF8?
>
> psql sets client_encoding based on its environment (LANG or related variables). That's been true for some time --- since 9.1, according to a quick check. regards, tom lane

My knowledge of encoding is minimal, but to expand on the comment: client and server (or, more specifically, database) encodings can and often do differ, just as you are seeing here. I'm guessing that somewhere deep inside psql and/or postgres, encoding conversion is performed if the client and server do not match. While I guess it is possible to try to auto-adapt the client encoding to match the server/database, the current policy is to require the user to explicitly (so to speak) declare the encoding they are using on their client.

I guess a counter-question would be: what would you expect \set to report, and why?

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/9-3-RC1-psql-encoding-reporting-inconsistently-tp5769334p5769339.html
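The client/server distinction can be seen directly in a psql session; these are real commands, though the values shown will depend on the database and the client environment:

```sql
-- The encoding the database was created with:
SHOW server_encoding;

-- The encoding psql declared for this connection (derived from LANG etc.):
SHOW client_encoding;

-- Explicitly declare a different client encoding for this session;
-- the server converts between the two where a conversion exists:
SET client_encoding TO 'UTF8';
```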
Re: [HACKERS] Question regarding Sync message and unnamed portal
Tatsuo Ishii-4 wrote:

> From these statements, I would think #4 will fail in the following sequence of commands because #3 closes the transaction and destroys the unnamed portal: 1) Parse/Bind creates unnamed portal, 2) Parse/Bind/Execute creates named portal and executes, 3) Send Sync message (because it is required in extended protocol), 4) Execute unnamed portal created in #1. If this is true, that means unnamed portal execution and named portal execution cannot be mixed unless they are inside an explicit transaction. IMO this should be described in the document.

My relatively uninformed conclusions from reading this thread and the documentation excerpts:

Correct. In the absence of an explicit transaction, only a single logical query can be executed per transaction. #2 should be disallowed, though, as opposed to #4 failing, since the second PARSE implies a second logical query being formed, which should not occur outside an explicit transaction.

"At completion of each series of extended-query messages, the frontend should issue a Sync message." I take this phrase to mean that a logical sequence is defined as: PARSE - (BIND - EXECUTE){1,} - SYNC

If you are within an explicit transaction, you are brought back to the point where the next PARSE can be issued, and thus a new statement - within the same transaction - can be issued. If you are not within an explicit transaction, the SYNC effectively causes a "COMMIT;" to be issued on the back-end, and thus the unnamed statement and unnamed portal are both closed. If the resultant portal is named and defined WITH HOLD, the named portal will continue to exist, while the named (or unnamed, for that matter) statement will be closed.

The documentation assumes that you understand the single-statement nature of implicit transactions. Given the level of knowledge needed to operate with the frontend protocol, this is not necessarily unreasonable.
I guess my question is whether the backend fails in the situation of an implicit transaction and two PARSE messages without an intervening SYNC. If not, then when is it reasonable to issue two PARSE messages in this manner?

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Question-regarding-Sync-message-and-unnamed-portal-tp5726023p5770365.html
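The WITH HOLD behavior referenced above has a direct SQL-level counterpart in cursors, which is the easiest way to watch a portal outlive its transaction (the cursor name is illustrative):

```sql
BEGIN;
DECLARE c CURSOR WITH HOLD FOR SELECT g FROM generate_series(1, 3) AS g;
COMMIT;  -- without WITH HOLD, the cursor (portal) would be closed here

FETCH ALL FROM c;  -- still works: the held portal survived the commit
CLOSE c;
```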
Re: [HACKERS] Question regarding Sync message and unnamed portal
Continuing my novice interpretation from before...

Tatsuo Ishii-4 wrote:

> It would be nice if something like "the unnamed portal will be destroyed by a Sync message if you are in an explicit transaction" were in our manual.

I do not believe this to be true from what I've scanned. Inside an implicit transaction it will be, but in an explicit transaction the SYNC does not cause the transaction to close, so any named and unnamed portals should still exist after the SYNC (or FLUSH) sequence is completed. This allows for cursors to be maintained over the entire lifetime of the explicit transaction.

I do not follow where Tom's FLUSH comment is helpful here. In an explicit transaction their behavior should be semantically identical. In the implicit-transaction case the use of FLUSH should not enable you to change the statement that is being executed inside the single-statement transaction. In the example the error should come at #2 and you'd never get to #3, where you'd issue SYNC/FLUSH (or rather, by the time you get to three you are already dead).

SYNC vs. FLUSH in the documentation is confusing as to how they interact with implicit vs. explicit transactions. In the explicit transaction, since SYNC issues a FLUSH anyway, the only difference seems to be the presence of the ReadyForQuery message.

How does this (SQL) sequence work - knowing each COMMAND is a sequence of PARSE-BIND-EXECUTE extended commands that either succeeds or fails:

1) BEGIN;
2) SUCCESSFUL COMMAND;
3) SYNC;
4) ERROR COMMAND;
5) SUCCESSFUL (in theory) COMMAND;
6) SYNC;
7) SUCCESSFUL (in theory) COMMAND;
8) COMMIT;

The second SYNC resets and returns to normal message processing, but the transaction as a whole is in a failure mode. The interesting items are 5 and 7.

> So if we are not in an explicit transaction we cannot reuse the unnamed portal because Sync closes the transaction

Correct. In an implicit transaction all un-held portals are closed upon SYNC - named or otherwise.
Note that the use of "not explicit" instead of "implicit" makes reading this discussion a little more difficult. However, because of "An unnamed prepared statement lasts only until the next Parse statement specifying the unnamed statement as destination is issued", the unnamed statement should still exist.

I can see this being useful if you are inserting 10,000 records via the extended protocol and want to save every 1,000. After SYNCing the first 1,000 BIND/EXECUTEs, you can continue BINDing the original unnamed statement to execute the next 1,000, and so forth.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Question-regarding-Sync-message-and-unnamed-portal-tp5726023p5770370.html
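The batching idea above has a rough SQL-level analogue using a named prepared statement, which likewise survives the commit that ends each batch (the statement name, table, and values are illustrative):

```sql
PREPARE ins (integer) AS INSERT INTO t (v) VALUES ($1);

BEGIN;
EXECUTE ins(1);
EXECUTE ins(2);
COMMIT;  -- the prepared statement is session-scoped, so it survives

BEGIN;
EXECUTE ins(3);  -- the next batch reuses the same statement
COMMIT;
```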
Re: [HACKERS] Pending query cancel defeats SIGQUIT
Noah Misch-2 wrote:

> The errfinish() pertaining to that WARNING issues CHECK_FOR_INTERRUPTS(), and the query cancel pending since before the SIGQUIT arrived then takes effect. This is less bad on 9.4, because the postmaster will SIGKILL the backend after 5s. On older releases, the backend persists indefinitely.

"9.4" == head, or is this a typo? Your feelings on how far to back-patch?

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Pending-query-cancel-defeats-SIGQUIT-tp5770390p5770394.html
Re: [HACKERS] Weaker shmem interlock w/o postmaster.pid
Noah Misch-2 wrote:

> I'm thinking to preserve postmaster.pid at immediate shutdown in all released versions, but I'm less sure about back-patching a change to make PGSharedMemoryCreate() pickier. On the one hand, allowing startup to proceed with backends still active in the same data directory is a corruption hazard.

The corruption risk, IMV anyway, is sufficient to back-patch the change and overrides the concerns around very fast shutdown/restarts.

> Making PGSharedMemoryCreate() pickier in all branches will greatly diminish the marginal value of preserving postmaster.pid, so I'm fine with dropping the postmaster.pid side of the proposal.

It's probably still worth a fresh look at the immediate-shutdown process to see whether the current location where postmaster.pid is removed is acceptable. It may not be necessary to leave it in place always, but: 1) if there is a section of shared memory that can only be reached/found if one knows the pid, and 2) postmaster.pid is removed before that area is secured from future clobbering, then there may be a risk that can still be mitigated by moving its removal, without having to go to the extreme.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Weaker-shmem-interlock-w-o-postmaster-pid-tp5770399p5770559.html
Re: [HACKERS] Questions about checksum feature in 9.3
Ants Aasma-2 wrote:

>> So, has anyone compiled checksum vectorized on OS X? Are there any performance data that would indicate whether or not I should worry with this in the first place?
>
> Even without vectorization the worst case performance hit is about 20%. This is for a workload that is fully bottlenecked on swapping pages in between shared buffers and OS cache. In real world cases it's hard to imagine it having any measurable effect. A single core can checksum several gigabytes per second of I/O without vectorization, and about 30GB/s with vectorization.

Any thoughts on how/where to provide guidance on this kind of concern? The single paragraph in the initdb documentation seems to be lacking. Would a destination page on the wiki, linked to from the documentation, where current knowledge regarding benchmarks and caveats can be stored, be appropriate? To that end, Ants, do you actually have some resources and/or benchmarks which support your claim and that you can provide links to?

The single-core aspect is interesting. Does the implementation have a dedicated core to perform these calculations, or must the same thread that handles the relevant query perform this work as well? How much additional impact/overhead does having to multitask have on the maximum throughput of a single core in processing checksums?

This whole vectorization angle also doesn't seem to be in the documentation... though I didn't look super hard.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Questions-about-checksum-feature-in-9-3-tp5770936p5771100.html
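For context, data checksums are chosen at cluster-creation time, which is why the initdb documentation is the natural (if terse) home for this guidance. The commands below are the standard ones as of 9.3; the data-directory path is illustrative:

```
# Enable data page checksums when creating the cluster (9.3+):
initdb --data-checksums -D /path/to/data

# Verify afterwards; pg_controldata reports the checksum version
# (0 means checksums are disabled):
pg_controldata /path/to/data | grep -i checksum
```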
Re: [HACKERS] Not In Foreign Key Constraint
Misa Simic wrote:

> I guess that rule can be achieved with triggers on TableA and TableC - but the same is true for FK (and an FK constraint is more effective than a trigger - that is why I wonder whether it would be useful/achievable to create that kind of constraint). Thoughts, ideas?

You create a common "keys in use" table and only insert a record into the main tables if you can successfully add the desired key to the shared keys table (as a unique value). Set up a normal FK to that table to help enforce that valid records must exist in the keys table. Not fool-proof, but you only need to worry about insertions - delete from the key table to remove the record from the main table and free up the key.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Not-In-Foreign-Key-Constraint-tp5771056p5771546.html
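The shared "keys in use" pattern described above can be sketched as follows (all table and column names are illustrative). The primary key on the shared table is what guarantees a key can only be claimed once:

```sql
-- The shared table: each key may be claimed by at most one row, ever.
CREATE TABLE keys_in_use (k integer PRIMARY KEY);

-- Both main tables reference the shared table, so a row can only exist
-- for a key that was successfully claimed.
CREATE TABLE table_a (
    k       integer PRIMARY KEY REFERENCES keys_in_use (k),
    payload text
);
CREATE TABLE table_c (
    k       integer PRIMARY KEY REFERENCES keys_in_use (k),
    payload text
);

-- Claim the key first, then insert into exactly one main table:
INSERT INTO keys_in_use VALUES (1);
INSERT INTO table_a VALUES (1, 'a');
-- A second claim of the same key fails on the primary key:
-- INSERT INTO keys_in_use VALUES (1);  -- duplicate key error
```

As noted, this is not fool-proof: the FK only guarantees the key was claimed, not that both main tables avoid using the same claimed key, so the application must insert into only one main table per claim.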
Re: [HACKERS] Not In Foreign Key Constraint
Misa Simic wrote:

> Hi hackers, I just wonder how hard it would be to implement something like a "Not In FK Constraint", or the opposite of an FK...

A more useful next couple of sentences would have been along the lines of: "I have this problem... I've approached it by doing... but it seems that an actual database-enforced constraint would be a better solution. Is that something that has been considered? Are there other ways of attacking this problem I have not considered?"

You took quite a bit of time to try and start a discussion, and I get that you don't necessarily know where it is going to lead, but "Not In FK constraint", with a descriptive sentence or two, likely would have been enough to get the ball rolling. Instead you devoted more space to technical clarification that would have been better spent explaining what the problem is and how current approaches to dealing with it are limited. A more specific end-question would also help solicit better responses.

I say all this because, three days later, nothing more substantial than "why is this feature necessary?" has been put forth. The general idea likely has some merit, but you've not provided anything for people to sink their teeth into.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Not-In-Foreign-Key-Constraint-tp5771056p5771651.html
Re: [HACKERS] Documentation for SET var_name FROM CURRENT
Amit Kapila-2 wrote:

> While reading the documentation for the SET command, I observed that the FROM CURRENT syntax and its description are missing from the SET command's syntax page (http://www.postgresql.org/docs/devel/static/sql-set.html). Do you think that the documentation should be updated, or is there any reason why it is not documented?

It is documented as part of CREATE FUNCTION, since its use is only valid in that context. The paragraph with the link to CREATE FUNCTION seems sufficient to notify and direct people to the needed description for this.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Documentation-for-SET-var-name-FROM-CURRENT-tp5772920p5772922.html
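For readers unfamiliar with the syntax under discussion: FROM CURRENT snapshots the session's current value of a setting into the object being defined. The CREATE FUNCTION form looks like this (the function and schema names are illustrative):

```sql
SET search_path = app_schema, public;

CREATE FUNCTION lookup() RETURNS integer
    LANGUAGE sql
    SET search_path FROM CURRENT  -- bakes in 'app_schema, public'
AS $$ SELECT 1 $$;
```

The function will then run with the saved search_path regardless of what the calling session has set at call time.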
Re: [HACKERS] Documentation for SET var_name FROM CURRENT
Amit Kapila-2 wrote On Tue, Oct 1, 2013 at 10:25 AM, David Johnston <polobo@> wrote: Amit Kapila-2 wrote While reading documentation for SET command, I observed that FROM CURRENT syntax and its description is missing from SET command's syntax page (http://www.postgresql.org/docs/devel/static/sql-set.html). Do you think that documentation should be updated for the same or is there any reason why it is not documented? It is documented as part of CREATE FUNCTION since its use is only valid in that context. Not only as part of CREATE FUNCTION, but as part of ALTER DATABASE, ALTER ROLE, ALTER FUNCTION syntax as well. In all these places other syntax of SET is also used and described. I think you are right that the syntax SET .. FROM CURRENT is mainly used in context with a few other SQL statements, but as it is part of the SET command, isn't it better to mention the syntax on the SET page and maybe explain a bit about its usage? The paragraph with the link to CREATE FUNCTION seems sufficient to notify and direct people to the needed description for this. After explaining the usage in short, maybe we can provide links to all other statements where it can be used, but I think the syntax SET .. FROM CURRENT should be there with the SET command's other syntax. FROM CURRENT is only valid as part of the SET sub-command attached to CREATE FUNCTION. Yes, a number of SQL commands take a sub-command called SET. These are not the same as the top-level SQL SET command and have their own rules and syntax defined on the parent command's page. They share a key word to make the grammar and usage saner but these are semantically different statements. A paragraph cross-referencing where SET sub-commands exist has merit but since the main SET command does not accept FROM CURRENT it (FC) should not be included in its page directly. If you want to put forth an actual documentation change and get a concrete opinion then by all means.
If you want someone else to do it I'm detailing why that is unlikely to happen. The link to section 18.1 from the SET command documentation covers most of the other relevant info about how to go about configuring the system. The SQL command reference is generally a much more narrow scope focusing on the syntax of the specific command listed. The current documentation for SET conforms to this reality. We both know how the settings sub-system works. I don't see many novice questions, though my only exposure is the mailing list, about this kind of thing, so the documentation seems effective. Other than supposed completeness, are there other reasons you feel the treatment of FROM CURRENT, or SETtings in general, needs modification? David J.
Re: [HACKERS] Documentation for SET var_name FROM CURRENT
David Johnston wrote A paragraph cross-referencing where SET sub-commands exist has merit but since the main SET command does not accept FROM CURRENT it (FC) should not be included in its page directly. It is strange that this actually does work - at least in 9.0 - given that SET ... FROM CURRENT does not seem to have any usage outside of its interaction as part of the CREATE FUNCTION command. Is there some use-case I am not seeing? Since the command works I would agree that it should be provided in the syntax section for SET and that a comment be added that says generally that its presence is an historical artifact and has no real use as part of the top-level command. Its intended use is in conjunction with the CREATE FUNCTION command. Alternative wordings to describe uses I am not seeing are good too. David J.
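For reference, the intended use discussed above can be sketched as follows; the function body, schema, and setting values are illustrative only:

```sql
-- FROM CURRENT captures the session's value of a setting at
-- CREATE FUNCTION time and attaches it to the function.
SET search_path TO myschema, public;   -- hypothetical schema

CREATE FUNCTION report() RETURNS integer
    LANGUAGE sql
    SET search_path FROM CURRENT       -- snapshots "myschema, public"
AS $$ SELECT 1 $$;
```

The function then runs with the captured search_path regardless of the caller's session setting, which is why the clause is documented alongside CREATE FUNCTION rather than on the SET page.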
[HACKERS] Re: dynamic shared memory: wherein I am punished for good intentions
Robert Haas wrote Unfortunately, the buildfarm isn't entirely happy with this decision. On buildfarm member anole (HP-UX B.11.31), allocation of dynamic shared memory fails with a Permission denied error, and on smew (Debian GNU/Linux 6.0), it fails with Function not implemented, which according to a forum post[1] I found probably indicates that /dev/shm doesn't mount a tmpfs on that box. What shall we do about this? I see a few options. Is this something that rightly falls into being a distro/package-specific setting? If so then the first goal should be to ensure the maximum number of successful basic installation scenarios - namely someone installing PostgreSQL and connecting to the running postgres database without encountering an error. As a default I would presume the current System V behavior is sufficient to accomplish this goal. If package maintainers can then guarantee that changing the default will improve the user experience they should be supported and encouraged to do so, but if they are at all unsure they should leave the default in place. As long as a new user is able to get a running database on their machine, if/when they run up against the low defaults of System V memory they will at least be able to focus on that single problem as opposed to having a failed initial install and being unsure exactly what they may have done wrong. Thus option #2 seems sufficient. I do think that having some kind of shared-memory-manager utility could have value but I'd rather see that be a standalone utility as opposed to something magical done inside the bowels of the database. While probably harder to code and learn, such a utility would provide for a much greater UX if implemented well. David J.
[HACKERS] Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Sameer Kumar wrote edb=# explain analyze select max(score) from student_score group by course; This query returns 6 records. The window one returns 123,000. Why do you expect these to have anywhere near the same performance or plan? You can enable/disable indexes/scans to see what alternative plans may provide but nothing here stands out as being obviously incorrect. I'm not really clear on what your question is. Generally it sounds as if you are wondering if there are any plans to improve the algorithms behind window function processing. Are you just looking at symptoms and thus possibly have unreasonable expectations or do you actually see an avenue for improvement in the engine? QUERY PLAN | Sort Method: external merge Disk: 7576kB Work memory: I/O is killing your performance on this query. It is more flexible but you pay a price for that. Another thing, (I may be stupid and naive here) does PostgreSQL re-use the hash which has been already created for sort? In this case the inner query must have created a hash for the windowing aggregate. Can't we use that same one while applying the filter rn=1? Probably, but others more knowledgeable will need to answer authoritatively. David J.
Re: [HACKERS] Additional information on log_line_prefix
emanuel_calvo wrote %E = estimated rows How would you expect this to work? This information seems mostly useless without the context of a full EXPLAIN output. %T = temporal tables used I am guessing you mean temporary, not temporal - the latter also being known as time-oriented. David J.
Re: [HACKERS] How should row-security affects ON UPDATE RESTRICT / CASCADE ?
Tom Lane-2 wrote Craig Ringer <craig@> writes: During my testing of Kohei KaiGai's row-security patches I've been looking into how foreign keys should be and are handled. There are some interesting wrinkles around FK cascades, the rights under which FK checks execute, and about the consistency effects of changing or applying an RLS policy. As I recall, I've been saying since day one that row-level security cannot sensibly coexist with foreign-key constraints, and I've been told that the potential users of such a feature don't care. I'm glad to see somebody else complaining. As far as the points you're making go, I think we must say that RLS checks are not applied during FK trigger queries, ie the FK triggers can always see everything even though they don't run as superuser. Is there some way to enforce that the PK and FK hosting tables have compatible RLS definitions? The examples that come to mind are: 1) both tables have RLS filters on at least one of the FK relationship columns, so in a multi-tenant situation a given user is likely (hard to enforce perfectly) to be restricted to checking only the subset of rows in the PK that belong to their tenant. 2) the PK table has no filter AND the FK table does not have an RLS filter on any of the columns being used in the FK. This covers shared lookup tables. I see no serious problem with DELETE FK-triggers but the ability to PK-probe by inserting into a FK table does seem to need limitation. Of course the normal direct-insert RLS checks will help (and maybe totally) to cover #1 above. The other question is whether such a hidden relationship constitutes a mis-configuration of RLS. This goes back to compatibility - is there some algorithm that can be applied to FK constraints and the associated tables that can measure compatibility and generate warnings when a constraint or RLS definition is added or changed on those tables? An error is probably too severe, especially at first.
Lacking a use-case for when two incompatible tables need to have a FK-PK relationship I'm more inclined to force the application of RLS across the relationship constraint and consider these trigger errors to be symptoms of a mis-configuration of the RLS policy that need to be fixed by the DBA. In the presence of a mis-configured policy the ability to provide security guarantees is shot, and the examples so far all prove that. Table B should have the PK record visible for corresponding visible FK records on table A; otherwise there would have been no way to insert the table A row initially, which means there was a time when an (invalid) constraint was added that broke the relationship, and at that point an error should have been raised. Hopefully this all sparks some thoughts from others much more familiar with RLS than I. David J.
Re: [HACKERS] Feature request: Optimizer improvement
Jim Nasby-2 wrote Should that really matter in this case? ISTM we should always handle LIMIT before moving on to the SELECT clause…? SELECT generate_series(1,10) LIMIT 1 David J.
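The one-line counter-example above hinges on a set-returning function in the SELECT list; a sketch of why LIMIT cannot simply be handled before the SELECT clause:

```sql
-- The target-list SRF itself produces the rows being limited, so the
-- SELECT clause has to be evaluated for there to be anything to limit.
SELECT generate_series(1, 10) LIMIT 1;

-- Moving the SRF into FROM gives the executor a node it can stop early:
SELECT g FROM generate_series(1, 10) AS g LIMIT 1;
```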
Re: [HACKERS] logical column order and physical column order
David Rowley wrote I'm sure in the real world there are many cases where a better choice in column ordering would save space and save processing times, but is this something that we want to leave up to our users? Right now there is little visibility, from probably 99% of people, that this is even something to be concerned with. I have no qualms with making a person run a routine to change the physical ordering of their tables - and if they really care about logical order in the output it is best to list the column names anyway - so the problem that is worth solving is providing a way for the system to tell the user, for a given table, what the most efficient physical order would be - ideally in the form of a CREATE TABLE AS statement - and let the user manually effect the change. So invent the algorithm to identify the best physical order and make it accessible to users for manual use. If the benefits seem great enough after its use for a couple of releases a more informed decision can be made as to whether to try and automate its application. David J.
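The savings being discussed come largely from alignment padding; a minimal sketch of how a user could observe the effect today (table and column names are hypothetical):

```sql
-- A 1-byte boolean before an 8-byte-aligned bigint forces 7 bytes of
-- padding per row; reversing the column order packs the row tighter.
CREATE TABLE padded (flag boolean, id bigint);
CREATE TABLE packed (id bigint, flag boolean);

-- Composite sizes expose the difference without inspecting heap pages:
SELECT pg_column_size(ROW(true, 1::bigint)) AS padded_row,
       pg_column_size(ROW(1::bigint, true)) AS packed_row;
```

An advisory tool of the sort suggested above would essentially sort columns by descending alignment requirement and emit the resulting CREATE TABLE AS statement.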
Re: [HACKERS] TABLE not synonymous with SELECT * FROM?
Colin 't Hart wrote Methinks we should fix the documentation, something like: The command TABLE name is equivalent to SELECT * FROM name It can be used as a top-level command or as a space-saving syntax variant in parts of complex queries. Only the WITH, ORDER BY, LIMIT, and locking clauses and set operations can be used with TABLE; the WHERE clause and any form of aggregation cannot be used. The paragraph is unnecessary if the Synopsis section of the SELECT documentation is updated to correctly reflect all the valid clauses that can be attached to TABLE. The current reading implies that you cannot attach anything, so when you said LIMIT worked I was surprised. Also, testing seems to confirm that the allowance of LIMIT implies that OFFSET is allowed as well. If TABLE is allowed as a top-level command why doesn't it get its own page in the SQL commands section? It really doesn't matter - and honestly, while I've known about it I've never actually thought to use it in actual queries because as soon as you want to do something special you have to switch it out for SELECT * FROM anyway - but it does seem inconsistent. David J.
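Assuming the clause list above is right, the equivalence and its limits look like this (table and column names are hypothetical):

```sql
TABLE films;                                   -- same as SELECT * FROM films
TABLE films ORDER BY title LIMIT 5 OFFSET 10;  -- these clauses do attach

-- TABLE films WHERE kind = 'Comedy';          -- rejected; switch back to:
SELECT * FROM films WHERE kind = 'Comedy';
```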
Re: [HACKERS] pg_dump and pg_dumpall in real life
Andrew Dunstan wrote A general ability to rename things would be good. In particular, restoring schema x into schema y or table x into table y would be very useful, especially if you need to be able to compare old with new. Compare old and new what? I would imagine that schema comparisons would be much easier if the only thing that is different is the database name and you compare database old to database new. Are there any existing threads or posts, that you recollect, that detail solid use-cases for clone-and-rename mechanics? I don't seem to recall anything in the past year or so but my coverage is probably only about 70% in that timeframe. SQL seems particularly unfriendly to renaming and runtime name resolution in general (largely due to caching effects). Some kind of alias mechanism makes sense conceptually but the performance hit for such isn't likely to be worth incurring. I could see having table name aliases so that raw data in a dump from one database could be restored into another, but I'd likely require that the user be able to generate the target schema from source themselves. That would facilitate the use-case where the DBA/programmer is able to fully recreate their schema from source and only require that actual data be restored into the newly created database. I can see where grants may fall into a grey middle-area but functions/views/triggers and the like would need to be synchronized with any schema naming changes and that should, IMO, be driven from source and not facilitated by a dump/restore process. David J.
Re: [HACKERS] pg_dump and pg_dumpall in real life
Josh Berkus wrote Well, then we just need pg_restore to handle the role already exists error message gracefully. That's all. Or a CREATE ROLE IF NOT EXISTS statement, and use that for roles. My only qualm here is if the exists check is based off of role name only. If database A and database B came from different clusters but both have a role david, the actual identity of david is (could be) different because of the source cluster. The risk of such occurring in a high-security situation is likely to be small, but some kind of --ignore-different-cluster-same-role flag may be worthwhile such that pg_restore will error unless that flag is set (i.e., high security by default). The error itself should be rare enough that most people wouldn't even notice it is there, but anyone seeing such an error (with a hint provided as well) would easily be able to disable it and continue on with the restore. David J.
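Pending either fix, the name-only existence check under discussion can be emulated in a restore script today; a sketch (the role name is illustrative):

```sql
-- Create the role only when no role of that name exists.  Note the test
-- is by name alone, which is exactly the cross-cluster identity concern
-- raised above.
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'david') THEN
        CREATE ROLE david;
    END IF;
END
$$;
```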
Re: [HACKERS] Transaction-lifespan memory leak with plpgsql DO blocks
Robert Haas wrote That's a sufficiently astonishing result that it wouldn't be surprising for this to get reported as a bug where a simple performance gap wouldn't be, and I think if we don't fix it the perception will be that we've left that bug unfixed. Now, there are lots of things we don't fix just because there is not an infinitely large army of trained PostgreSQL hackers who love to fix other people's bugs for free, so I'm not going to say we HAVE to fix this or whatever - but neither do I think fixing it is useless and worthless. Having had this same thought WRT the FOR UPDATE in LOOP bug posting, the lack of a listing of outstanding bugs does leave some gaps. I would imagine people would appreciate something like: Frequency: Rare; Severity: Low; Fix Complexity: Moderate; Work Around: Easy - create an actual function; create some form of loop; Status: Confirmed - Awaiting Volunteers to Fix. Even without a formal system it may not hurt for bug threads to have a posting with this kind of information summarizing the thread. As Tom is apt to do - for the sake of the archives - though mostly I see those once something has been fixed and not for items that are being left open. Ideally these could also be migrated to the wiki, with links back to the main thread, to provide a basic known-open-items interface - something that I imagine would make corporate acceptance of PostgreSQL more likely. I don't see where there are a considerably large number of these unresolved items - most things do indeed get fixed or explained away as normal user learning. Sorry for the digression but it seems relevant. David J.
Re: [HACKERS] First patch -- somewhat trivial feature
Stephen Frost wrote * Robert Berry ( berrydigital@ ) wrote: This is my first attempt at writing a patch, so it's pretty simple. Neat! seq_page: 1.00, rnd_page: 4.00, cpu_tup: 0.01, cpu_ind: 0.005000, cpu_op: 0.002500 amenabled: 111 The bit vector is enable variables in the order listed in cost.h, though mainly provides a high level view on whether or not any strategies are disabled. While I like the general idea, I have to admit that I don't particularly like the format and I'm not sure why it makes sense to have this as part of 'explain'? Why not do a 'show all;' ahead of the explain? I kinda get the theory behind this but, WRT formatting, explain can output multiple formats and any patch affecting said output should provide for changing all of them. Having each of the sample outputs in the post would allow for comments from those who would not generally apply such patches. David J.
Re: [HACKERS] additional json functionality
Hannu Krosing-5 wrote On 11/14/2013 08:17 PM, Merlin Moncure wrote: On Thu, Nov 14, 2013 at 11:34 AM, David E. Wheeler <david@> wrote: On Nov 14, 2013, at 7:07 AM, Merlin Moncure <mmoncure@> wrote: This is exactly what needs to be done, full stop (how about: hstore). It really comes down to this: changing the serialization behaviors that have been in production for 2 releases (three if you count the extension) is bad enough, but making impossible some legal json constructions which are currently possible is an unacceptable compatibility break. The current json format is a minimally conforming (i.e., does not enforce the "should not contain duplicates" suggestion) structured json validating type that stores its input as-is once validated. Its presence is probably going to cause difficulties with function API for reasons already mentioned, but its place in the core type library is already firmly established. Andrew's API additions seem like good things to have for this type. I haven't seen any comments on this but do these functions facilitate creating json that can have duplicates and that maintains order? Even if we accept input to json with these limitations we are not obligated to make our own json output minimally conforming - though we should at least maintain such if it is already in place. So my suggestion is to upgrade existing json data type to text - or maybe json_text with format check - when upgrading to 9.4, to change current functions which output json to output text, and have a new json type which stores proper JavaScript Object-like structured data. Technically a down-grade but anyway... How does this work with a pg_dump/pg_restore upgrade? If we want to have maximally conforming json type(s) we can still create them. I'd say we'd still want two versions, similar in a way to how we have bytea and text even though any text can technically be stored like bytea.
The constructor API for both would want to be identical, with the only real difference being that text -> json_source would be layout preserving (i.e., validation only) while text -> json_binary would be a true parsing conversion. Likewise json_source -> text would output the same input while json_binary -> text would output the canonical form (pretty-printing and such would need to be initiated via functions). If things are going to be a little more complex anyway why not just go and toss in the kitchen sink too? This way we provide maximal flexibility. From a development perspective some features (indexes, equality, in-place updates and related modification API) may only make sense on a subset of the available types but trade-offs are a fact of life. David J.
Re: [HACKERS] additional json functionality
Merlin Moncure-2 wrote I don't want to have two types, but I think I'd probably rather have two clean types than this. I can't imagine it being remotely acceptable to have behaviour depend on whether or not something was ever stored, which is what this looks like. Well, maybe so. My main gripe with the 'two types' solutions is that: 1) current type is already in core (that is, not an extension). In hindsight, I think this was a huge mistake. 2) current type has grabbed the 'json' type name and the 'json_xxx' API. 3) current type is getting used all over the place 'Two types' means that (AIUI) you can't mess around with the existing API too much. And the new type (due out in 2016?) will be something of a second citizen. The ramifications of dealing with the bifurcation is what makes *my* head hurt. Every day the json stuff is getting more and more widely adopted. 9.4 isn't going to drop until 2014 best case and it won't be widely deployed in the enterprise until 2015 and beyond. So you're going to have a huge code base operating on the 'legacy' json type. merlin The current type can store the exact same data as what a hash-like type could store. It can also store stuff a hash-like type would not be able to store. From my reading the main reason for adding the new hash-like type would be to improve the performance characteristics of using said type. So: 1) if reasonable performance can be had with the current type the new type would be unnecessary 2) if #1 is not possible then the new type trades off leniency in format for performance improvements One implication of #2 is that existing json that wants the improved performance will need to undergo a full-table rewrite in order to be converted. Both output textual representations are identical, and function overloading and API should be able to remain substantially identical between the two types.
David J.
Re: [HACKERS] additional json functionality
Josh Berkus wrote On 11/15/2013 02:59 PM, Merlin Moncure wrote: On Fri, Nov 15, 2013 at 4:31 PM, Hannu Krosing <hannu@> wrote: I think you may be on to something here. This might also be a way to opt in to fast(er) serialization (upthread it was noted this is unimportant; I'm skeptical). I deeply feel that two types is not the right path but I'm pretty sure that this can be finessed. As far as I understand merlin is mostly ok with stored json being normalised and the problem is just with constructing extended json (a.k.a. processing instructions) to be used as source for specialised parsers and renderers. Thing is, I'm not particularly concerned about *Merlin's* specific use case, which there are ways around. What I am concerned about is that we may have users who have years of data stored in JSON text fields which won't survive an upgrade to binary JSON, because we will stop allowing certain things (ordering, duplicate keys) which are currently allowed in those columns. At the very least, if we're going to have that kind of backwards compatibility break we'll want to call the new version 10.0. That's why naming old JSON as json_text won't work; it'll be a hardened roadblock to upgrading. Agreed. I can't imagine a use-case that would warrant breaking the current behavior of json. Either we live with just one, text-oriented, json type and finesse whatever performance gains we can without breaking compatibility; or we introduce additional types (I personally like adding 2 instead of one but just adding the binary one would be ok) which - barring an overwhelming desire by -core to group-self-flagellate - means giving the new type an as yet unused name.
From a marketing perspective having 3 types with the following properties is an easy message to sell: 1) json - liberal interpretation w/ validation only; stored as text; output as-is 2) json_text - strict interpretation w/ validation only; stored as text; output as-is 3) json_binary - strict interpretation w/ validation parsing; stored as binary; output normalized This way json seems less like a mistake but rather an intentional desire to introduce a liberal type that meets data exchange needs in the short term and now, later, a structured data storage mechanism similar to hstore. Even if you have json_binary I can imagine that some people would want to be able to store the original strict json as-is. Sure, they can use text, but this way intent is made clear and validation is attached directly to the type as opposed to having to be done separately. The use-cases described for needing a liberal json prove this out. That said, json would be an acceptable replacement for json_text in many cases, and separate validation for strict json prior to storing into json isn't that heinous. David J.
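The behavioral split being proposed can be illustrated with inputs the current text-preserving json accepts as-is; how a binary type would treat them is the stated assumption in the comments:

```sql
-- The existing json type validates but stores and outputs input verbatim:
SELECT '{"a": 1, "a": 2}'::json;  -- duplicate keys survive
SELECT '{"b": 1, "a": 2}'::json;  -- key order is preserved on output

-- A strict/binary type would presumably reject or collapse the duplicate
-- and output keys in its own canonical order, which is the upgrade hazard
-- for stored data discussed above.
```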
Re: [HACKERS] additional json functionality
Looking at this a different way: could we just implement BSON and leave json alone? http://bsonspec.org/ David J.
Re: [HACKERS] additional json functionality
David E. Wheeler-3 wrote I like JSONB because: 1. The B means binary 2. The B means second 3. It's short 4. See also BYTEA. json_strict: Not sure about the bytea reference off-hand... I was pondering jsons, which meets the short property just fine, and the trailing s would stand for strict - the user-visible semantic that this type exhibits - rather than some less-visible binary attribute which most users would not really care about. I dislike the implication of plural-ness that the s imparts, though. Implication of second doesn't seem that important since both types provide useful semantics. I can imagine where the short aspect will lead people to accidentally type json where they mean to use jsonb; having just a single extra character will increase the likelihood they will not notice. Knowing about and having used json_strict previously, it will be more probable that such users will noticeably feel something is missing if they drop the whole _strict suffix. So, I'll toss out json_strict for my bikeshed contribution. David J.
[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Bruce Momjian wrote: Considering we are doing this outside of a transaction, WARNING and ERROR are pretty much the same, from a behavioral perspective. Should we change this and LOCK to be a warning?

From the calling application's perspective an error and a warning are definitely behaviorally different. For this I'd vote for a warning (haven't pondered the LOCK scenario), since using SET out of context means the user has a fairly serious misunderstanding of the code path they have written (accidentally or otherwise).

Notice makes sense (speaking generally and without much research here) for stuff where the ultimate outcome matches the statement but the statement itself didn't actually do anything. Auto-sequence and index generation fell into this, but even notice was too noisy. In this case we'd expect that the no-op statement was issued in error and thus should be changed, making a warning the right level of incorrectness to communicate. A notice would be more appropriate if there were valid use-cases for the user doing this and we just want to make sure they are conscious of the unusualness of the situation.

I dislike error for backward compatibility reasons. Saving the user from this kind of mistake doesn't warrant breaking what could be properly functioning code. Just because PostgreSQL isn't in a transaction does not mean the client isn't expecting the current code to work correctly - even if by accident - as part of a sequence of queries.

David J.
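To ground the discussion, here is the statement at issue in both contexts. Which message level the server emits for the first form is exactly what is being debated in this thread:

```sql
-- Outside any transaction block this statement has no lasting effect;
-- the thread debates whether it should draw a WARNING or an ERROR.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Inside a transaction block it is meaningful and uncontroversial:
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
COMMIT;
```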
[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Bruce Momjian wrote: On Mon, Nov 18, 2013 at 05:05:45PM -0800, David Johnston wrote:

Bruce Momjian wrote: Considering we are doing this outside of a transaction, WARNING and ERROR are pretty much the same, from a behavioral perspective. Should we change this and LOCK to be a warning?

From the calling application's perspective an error and a warning are definitely behaviorally different. For this I'd vote for a warning (haven't pondered the LOCK scenario), since using SET out of context means the user has a fairly serious misunderstanding of the code path they have written (accidentally or otherwise). Notice makes sense (speaking generally and without much research here) for stuff where the ultimate outcome matches the statement but the statement itself didn't actually do anything. Auto-sequence and index generation fell into this, but even notice was too noisy. In this case we'd expect that the no-op statement was issued in error and thus should be changed, making a warning the right level of incorrectness to communicate. A notice would be more appropriate if there were valid use-cases for the user doing this and we just want to make sure they are conscious of the unusualness of the situation. I dislike error for backward compatibility reasons. Saving the user from this kind of mistake doesn't warrant breaking what could be properly functioning code. Just because PostgreSQL isn't in a transaction does not mean the client isn't expecting the current code to work correctly - even if by accident - as part of a sequence of queries.

Well, ERROR is what LOCK returns, so if we change SET TRANSACTION to be WARNING, we should change LOCK too, so on backward-compatibility grounds, ERROR makes more sense. Personally, I am fine with changing them all to WARNING.

Error makes more sense if the goal is internal consistency. That goal should be subservient to backward compatibility.
Changing LOCK to a warning is less problematic since it does not seem likely that current code is functioning in such a way that, after upgrade, it would begin working differently in the absence of an error. Basically, someone would have to be trapping the error and conditionally branching their logic. That said, if this were a day-0 decision I'd likely raise an error.

Weakening LOCK doesn't make sense since it is day-0 behavior. Document the warning for SET as being weaker than ideal because of backward compatibility and call it a day (i.e. leave LOCK at error). The documentation, not the code, then enforces the feeling that such usage is considered wrong, without possibly breaking wrong-but-working code.

David J.
[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Bruce Momjian wrote: On Mon, Nov 18, 2013 at 06:30:32PM -0800, David Johnston wrote:

Personally, I am fine with changing them all to WARNING.

Error makes more sense if the goal is internal consistency. That goal should be subservient to backward compatibility. Changing LOCK to a warning is less problematic since it does not seem likely that current code is functioning in such a way that, after upgrade, it would begin working differently in the absence of an error. Basically, someone would have to be trapping the error and conditionally branching their logic. That said, if this were a day-0 decision I'd likely raise an error. Weakening LOCK doesn't make sense since it is day-0 behavior. Document the warning for SET as being weaker than ideal because of backward compatibility and call it a day (i.e. leave LOCK at error). The documentation, not the code, then enforces the feeling that such usage is considered wrong, without possibly breaking wrong-but-working code.

We normally don't approach warts with documentation --- we usually just fix them and document them in the release notes. If we did, our docs would be a whole lot uglier.

That is a fair point - though it may be that this instance needs to be one of those usual exceptions. For any sane use-case turning this into an error shouldn't cause any grief, and those cases where there is grief should be evaluated and changed anyway. I could honestly live with either change to SET TRANSACTION but regardless would leave LOCK as-is. The backward compatibility concern, while valid, does indeed seem weak and worth breaking in order to maintain a consistent API going forward.

David J.
[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Robert Haas wrote: On Mon, Nov 18, 2013 at 9:07 PM, Bruce Momjian wrote: Well, ERROR is what LOCK returns, so if we change SET TRANSACTION to be WARNING, we should change LOCK too, so on backward-compatibility grounds, ERROR makes more sense. Personally, I am fine with changing them all to WARNING.

I don't think it's worth breaking backward compatibility. I'm not entirely sure what I would have decided here in a vacuum, but at this point existing precedent seems determinative.

Well, at this point we have already broken backward compatibility by releasing this. With Tom's thread necromancy I missed the fact that this got released in 9.3. Now, given normal upgrade realities, the people likely to have this bite them are probably a ways out from upgrading, so I wouldn't expect to have seen many complaints yet - but at the same time I do not recall seeing any complaints yet (limited to -bugs and -general).

The referenced patch:
- is released
- is documented
- is consistent with precedent established by similar codepaths
- causes an obvious error in what is considered broken code
- can be trivially corrected by a user willing and able to update their application

I'd say leave this as-is and only re-evaluate the decision if complaints are brought forth.

David J.
[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Tom Lane-2 wrote: David Johnston writes:

Robert Haas wrote: I don't think it's worth breaking backward compatibility. I'm not entirely sure what I would have decided here in a vacuum, but at this point existing precedent seems determinative.

Well, at this point we have already broken backward compatibility by releasing this. With Tom's thread necromancy I missed the fact that this got released in 9.3.

Uh, what? The commit I'm objecting to is certainly not in 9.3. It's this one:

Author: Bruce Momjian
Branch: master [a54141aeb] 2013-10-04 13:50:28 -0400

    Issue error on SET outside transaction block in some cases

    Issue error for SET LOCAL/CONSTRAINTS/TRANSACTION outside a transaction
    block, as they have no effect. Per suggestion from Morten Hustveit

I agree that it's too late to reconsider the behavior of pre-existing cases such as LOCK TABLE, but that doesn't mean I can't complain about this one.

My bad, I was relaying an assertion without checking it myself. I believe my source meant 9.4/head and simply mis-typed 9.3, which I then copied.

David J.
Re: [HACKERS] additional json functionality
Andrew Dunstan wrote: Given that, I'm not sure we shouldn't permit them in b) either. I think I lost that argument back in the 9.2 dev cycle. I really don't want to get to a situation where foo::json::jsonb can produce an error.

So what do you propose happens when the input json has duplicate keys? IMO a reasonable default cast function should error if the json contents require anything more than a straight parse to be stored into jsonb. If the user still needs to make the conversion we should have a standard, configurable parser function with json input and jsonb output. In this case the key-keep options would be "keep first encountered", "keep last encountered", or "fail on duplicate", the last of which would be the default.

I have not really pondered storing scalars into jsonb, but before pondering usability, are there any technical concerns? If the goal is to share the backend with hstore, then current hstore does not allow for this, and so the json aspect would either transfer back over or it would need customized code.

David J.
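A sketch of the configurable parser function suggested above. The name json_to_jsonb and its on_duplicate parameter are hypothetical, invented here to illustrate the proposal; no such function exists:

```sql
-- Hypothetical signature; not an existing PostgreSQL function.
-- on_duplicate would accept 'error' (the proposed default),
-- 'keep_first', or 'keep_last'.
--
-- CREATE FUNCTION json_to_jsonb(input json, on_duplicate text DEFAULT 'error')
--     RETURNS jsonb ...

-- Under the proposed default, a straight cast would fail on duplicates:
SELECT '{"a": 1, "a": 2}'::json::jsonb;   -- would raise an error

-- while an explicit choice keeps the conversion deterministic:
SELECT json_to_jsonb('{"a": 1, "a": 2}', 'keep_last');
```

The design point is that lossy decisions (which duplicate wins) are made explicitly by the caller rather than silently by the cast.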
Re: [HACKERS] additional json functionality
Gavin Flower-2 wrote: More seriously, there are obviously variants in what people consider a useful human-readable form of JSON output, but it is probably inefficient to store white space.

Enough to matter? Maybe the extra whitespace causes a marginal value to be toasted but, IIUC, for a value that is going to be toasted anyway, the compression factors for both speed and space are going to make whitespace considerations insignificant.

David J.
Re: [HACKERS] additional json functionality
Hannu Krosing-3 wrote: On 11/18/2013 06:49 PM, Josh Berkus wrote: On 11/18/2013 06:13 AM, Peter Eisentraut wrote: On 11/15/13, 6:15 PM, Josh Berkus wrote: Thing is, I'm not particularly concerned about *Merlin's* specific use case, which there are ways around. What I am concerned about is that we may have users who have years of data stored in JSON text fields which won't survive an upgrade to binary JSON, because we will stop allowing certain things (ordering, duplicate keys) which are currently allowed in those columns. At the very least, if we're going to have that kind of backwards compatibility break we'll want to call the new version 10.0.

We could do something like SQL/XML and specify the level of validity in a typmod, e.g., json(loose), json(strict), etc.

Doesn't work; with XML, the underlying storage format didn't change. With JSONB, it will ... so changing the typmod would require a total rewrite of the table. That's a POLS violation if I ever saw one.

We do rewrites on typmod changes already. To me having json(string) and json(hstore) does not seem too bad.

Three things:
1) How would this work in the face of functions that erase typmod information?
2) json [no typmod] would have to effectively default to json(string)?
3) How would #1 and #2 interact?

I pondered the general idea but my (admittedly limited) gut feeling is that using a typmod would possibly be technically untenable and, from an end-user perspective, would be even more confusing than having two types.

David J.
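For concreteness, a sketch of what the typmod idea quoted above might look like. json typmods like these do not exist in PostgreSQL; the syntax is purely hypothetical, modeled on SQL/XML's typmod usage, and some_json_func is an invented placeholder:

```sql
-- Hypothetical syntax only: json(...) typmods are not implemented.
CREATE TABLE t (
    a json(loose),   -- today's liberal, text-preserving behavior
    b json(strict)   -- proposed strict / normalized behavior
);

-- Question #1 above: function results generally lose typmod information,
-- so what would the result type be here, json(strict) or bare json?
-- SELECT some_json_func(b) FROM t;
```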
Re: [HACKERS] WITH ORDINALITY versus column definition lists
Tom Lane-2 wrote: It seems to me that we don't really want this behavior of the coldeflist not including the ordinality column. It's operating as designed, maybe, but it's unexpected and confusing. We could either: 1. Reinsert HEAD's prohibition against directly combining WITH ORDINALITY with a coldeflist (with a better error message and a HINT suggesting that you can get what you want via the TABLE syntax). 2. Change the parser so that the coldeflist is considered to include the ordinality column, for consistency with the bare-alias case. We'd therefore insist that the last coldeflist item be declared as int8, and then probably have to strip it out internally.

#2, but I am hoping to be able to make the definition of the column optional. One possibility is that if you do want to provide an alias, you have to make it clear that the coldeflist item in question is only valid as a WITH ORDINALITY column alias. Otherwise the entire coldeflist is used to alias the record-type output and the ordinality column is given its default name. Two options I came up with:

1) disallow any type specifier on the last item: t(f1 int, f2 text, o1)
2) add a new pseudo-type, ord: t(f1 int, f2 text, o1 ord)

I really like option #2. It makes it perfectly clear, entirely within the coldeflist SQL, that the last column is different and, in this case, optional: the user can omit the alias, and can also drop the whole ordinality aspect of the call. The system does not need to be told, by the user, the actual type of the ordinality column. And given that, I would suppose, most people would think to use int or bigint before using int8, usability is improved: once they need to alias the ordinality column they learn to use the ord type, which would internally resolve to the necessary output type.
Option #1 is somewhat simpler, but the slight added verbosity of #2 makes reading the coldeflist easier, IMO: you are already scanning name-type pairs, and recognizing a missing type is, for me, harder than reading off "ord" and recalling its meaning.

David J.
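The two options above, spelled out as queries. Neither syntax exists; both are hypothetical, and array_to_set is the example record-returning function used elsewhere in this thread:

```sql
-- Option #1 (hypothetical): the last coldeflist item carries no type
-- specifier, marking it as the ordinality-column alias.
SELECT * FROM array_to_set(ARRAY['one','two']) WITH ORDINALITY
    AS t(f1 int, f2 text, o1);

-- Option #2 (hypothetical): a new pseudo-type "ord" marks the alias,
-- and the system resolves its actual output type itself.
SELECT * FROM array_to_set(ARRAY['one','two']) WITH ORDINALITY
    AS t(f1 int, f2 text, o1 ord);
```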
Re: [HACKERS] WITH ORDINALITY versus column definition lists
Tom Lane-2 wrote: David Johnston writes:

Tom Lane-2 wrote: It seems to me that we don't really want this behavior of the coldeflist not including the ordinality column. It's operating as designed, maybe, but it's unexpected and confusing. We could either: 1. Reinsert HEAD's prohibition against directly combining WITH ORDINALITY with a coldeflist (with a better error message and a HINT suggesting that you can get what you want via the TABLE syntax). 2. Change the parser so that the coldeflist is considered to include the ordinality column, for consistency with the bare-alias case. We'd therefore insist that the last coldeflist item be declared as int8, and then probably have to strip it out internally.

Two options I came up with: 1) disallow any type specifier on the last item: t(f1 int, f2 text, o1) 2) add a new pseudo-type, ord: t(f1 int, f2 text, o1 ord) I really like option #2.

I don't. Pseudo-types have a whole lot of baggage. #1 is a mess too. And in either case, making coldef list items optional increases the number of ways to make a mistake, if you accidentally omit some other column for instance.

I'll have to trust the baggage/mess conclusion, but if you can distinctly and unambiguously identify the coldeflist item that is to be used for ordinality-column aliasing, then the mistakes related to the function-record coldeflist are the same as now. There may be more ways for the user to make a mistake (though still quite few, I would think), but the syntax ones are handled anyway, and if the others can be handled reasonably well the UI for the feature becomes more friendly. IOW, instead of adding int8 and ignoring it, we poll the last item, conditionally discard it (like the int8 case), then handle the possibly modified structure as planned.

Basically the problem here is that it's not immediately obvious whether the coldef list ought to include the ordinality column or not.
The user would probably guess not (since the system knows what type ordinality should be). Yes, if the column is not made optional somehow then I dislike option #2.

The TABLE syntax is really a vastly better solution for this. So I'm thinking my #1 is the best answer, assuming we can come up with a good error message. My first attempt would be:

ERROR: WITH ORDINALITY cannot be used with a column definition list
HINT: Put the function's column definition list inside TABLE() syntax.

Better ideas?

Works for me if #1 is implemented. Just to clarify, we are still allowing simple aliasing:

select * from generate_series(1,2) with ordinality as t(f1,f2);

It's only when the output of the function is record that the restriction of placing the record-returning function call inside TABLE (if you want ordinals) comes into play:

select * from table(array_to_set(array['one', 'two']) as (f1 int, f2 text)) with ordinality as t(a1,a2,a3);

If we could do away with having to re-specify the record aliases in the outer layer (a1, a2) then I'd be more understanding, but I'm thinking that is not possible unless you force a single-column alias definition attached to WITH ORDINALITY to mean "alias the ordinality column only". On the plus side: anyone using record-returning functions is already dealing with considerable verbosity, so this extra bit doesn't add that much overhead; and since the alias - t(a1,a2,a3) - is optional, if you don't care about aliasing the ordinality column the default case is not that verbose (just add the surrounding TABLE).

I feel like I need a flow-chart for #1... With #2 (with an optional alias) you can add an alias for the ordinality column anyplace you would be specifying a coldeflist OR alias list. Favoring the pseudo-type solution is the fact that, given the prior sentence, if you place "o1 ord" in the wrong place it is possible to generate an error like "WITH ORDINALITY not present for aliasing". #1 is simpler to implement and does not preclude #2 in the future.
Possible #3? Not sure if this is possible at this point, but really the alias for the ordinality column would be attached directly to the ordinality keyword, e.g.: ...) with ordinality{alias} as t(a1, a2)

David J.
Re: [HACKERS] WITH ORDINALITY versus column definition lists
Tom Lane-2 wrote: David Johnston writes: Just to clarify, we are still allowing simple aliasing: select * from generate_series(1,2) with ordinality as t(f1,f2);

Right, that works (and is required by spec, I believe). It's what to do with our column-definition-list extension that's at issue.

Not sure if this is possible at this point, but really the alias for the ordinality column would be attached directly to the ordinality keyword, e.g.: ...) with ordinality{alias} as t(a1, a2)

This has no support in the standard.

Now I'm just spinning some thoughts:

) with ordinality AS t(a1 text, a2 text | ord1)        -- type-less, but a different separator
) with ordinality AS t(a1 text, a2 text)(ord1)         -- stick it in its own section, type-less
) with ordinality AS t(a1 text, a2 text) ordinal(ord1) -- name the section too

We would probably want to extend the alias syntax to match... Is there any precedent in other RDBMSs to consider? I don't see any obvious alternatives to the ones you listed, and syntax is really not a huge barrier. If the implementation of an optionally specified alias is a barrier, then either someone needs to feel strongly enough to implement it or we just default to #1 for the time being. But others really haven't had a chance to read and respond yet, so I'm gonna get off this train for a while.

David J.
Re: [HACKERS] UNNEST with multiple args, and TABLE with multiple funcs
Robert Haas wrote: select * from table(array(select generate_series(10,20,5)), array['fred','jim']);

Can we have our arrays and eat our functions too? (And is someone willing to bake such a complicated cake...)

select * from table( ARRAY | FUNCTION/SET [, ARRAY | FUNCTION/SET ]* )

The standard-compliant case is handled as required - and those who want to write compliant code can use the array(select function) trick - while others can avoid straining their eyes and fingers. Since we would have to invent implicit unnesting anyway to conform, and the function version is working currently, the suggested behavior would seem to be the ideal target.

David J.
Re: [HACKERS] UNNEST with multiple args, and TABLE with multiple funcs
Tom Lane-2 wrote: Andrew Gierth writes:

Tom Lane wrote: ... and this would result in producing the array elements as a table column. There is nothing in there about a function returning set.

In the spec, there is no such thing as a function returning a set of rows in the sense that we use.

Right, but they do have a concept of arrays that's similar to ours, and AFAICS the spec demands different behavior for an array-returning function than what we've got here. We could conceivably say that we'll implicitly UNNEST() if the function returns array, and not otherwise --- but that seems pretty inconsistent and surprise-making to me. I'm not too sure what to do if a function returns setof array, either.

If a function returns a scalar array (RETURNS text[]) we would unnest the array per spec. If it returns a set (RETURNS SETOF anything, including a single array) we would not unnest it, since set-returning functions are non-spec - instead we'd use our SRF processing routine. If the function returns a scalar non-array, the implicit single row returned by the function would be output.

How would the spec interpret:

CREATE FUNCTION f(IN text, OUT text[]) RETURNS record AS $$ ...

TABLE( f('id_123') )

If that is illegal because the result is not just a single array value, then we would not unnest the component array and would also output the implicit single row.

My $0.02, quickly gathered.

David J.
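The dividing line proposed above can be sketched with queries. Note this is hypothetical behavior being argued for, not what any released PostgreSQL does with the TABLE keyword:

```sql
-- Per spec (proposed): a function returning a scalar array is implicitly
-- unnested inside TABLE(), one row per element.
-- string_to_array returns text[], a scalar array, not a set.
SELECT * FROM TABLE( string_to_array('a,b,c', ',') );
-- hypothetical result: three rows, 'a', 'b', 'c'

-- Non-spec SRF (proposed): a SETOF function keeps its ordinary
-- set-returning semantics, with no implicit unnesting.
SELECT * FROM TABLE( generate_series(1, 3) );
-- hypothetical result: three rows, 1, 2, 3
```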
Re: [HACKERS] UNNEST with multiple args, and TABLE with multiple funcs
Tom Lane-2 wrote: We could conceivably say that we'll implicitly UNNEST() if the function returns array, and not otherwise --- but that seems pretty inconsistent and surprise-making to me.

The use-cases for putting a scalar-array-returning function call into a TABLE construct and NOT wanting the array to be unnested are likely few and far between. Neither the inconsistency nor the surprise-making are serious deal-breakers for me. And if we do go with the "screw the standard" approach, then we should state right now that we will never adhere to the standard here, on inconsistency grounds, and not even encourage others to make it work. If TABLE( array_scalar_func() ) ends up only returning a single row, then nothing can be done later to make it unnest the array and conform with the standard syntax without breaking backward compatibility. I'd rather change TABLE to FUNCTION and leave the implementation of TABLE open for future standards compliance - which maybe you do as well and just haven't carried that sentiment to your more recent responses.

David J.
Re: [HACKERS] Why is UPDATE with column-list syntax not implemented
AK wrote: The 9.3 documentation says: "According to the standard, the column-list syntax should allow a list of columns to be assigned from a single row-valued expression, such as a sub-select:

UPDATE accounts
   SET (contact_last_name, contact_first_name) =
       (SELECT last_name, first_name FROM salesmen
         WHERE salesmen.id = accounts.sales_id);

This is not currently implemented — the source must be a list of independent expressions." Why is this not implemented? Is it considered inconvenient to use, or difficult to implement, or not important enough, or some other reason?

I cannot answer why, but I too would like to see this. I actually asked this a long while back but cannot seem to find my posting or recall the response.

David J.
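Until the row-valued form is implemented, the same effect can be had with one correlated sub-select per target column, i.e. the "list of independent expressions" the documentation describes. A straightforward rewrite of the example above:

```sql
-- Works today: each target column gets its own independent sub-select.
UPDATE accounts
   SET contact_last_name  = (SELECT last_name
                               FROM salesmen
                              WHERE salesmen.id = accounts.sales_id),
       contact_first_name = (SELECT first_name
                               FROM salesmen
                              WHERE salesmen.id = accounts.sales_id);
```

The obvious downside, and part of the motivation for the standard form, is that the correlated sub-select is written (and potentially executed) once per column.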
Re: [HACKERS] why semicolon after begin is not allowed in postgresql?
AK wrote: Kevin, I do see your logic now, but this thing is a common mistake - it means that this seems counter-intuitive to some people. What would happen if we applied Occam's razor and just removed this rule? All existing code would continue to work as-is, and we would have one less rule to memorize. That would make PostgreSQL a slightly better product, right?

I'm somewhat on the fence here but am leaning toward maintaining the status quo, mostly because of the analogy between IF ... END IF; and the SQL BEGIN; command, which is an entirely separate construct. I would maybe change the documentation so that instead of simply dictating a rule we explain why the syntax is the way it is - like this thread is doing. If users consciously omit the semicolon, hopefully they also understand that what they are beginning is a code block in plpgsql as opposed to an SQL transaction.

That said, technical purity isn't always a good answer. I'd be inclined to let someone passionate enough about the idea implement it and critique the result instead of disallowing it outright; but in the end that is likely to produce the same outcome.

David J.
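To make the distinction above concrete: the SQL BEGIN is a complete statement that starts a transaction and so takes a semicolon, while the plpgsql BEGIN merely opens a code block and takes none:

```sql
-- SQL: BEGIN is a statement, terminated like any other.
BEGIN;
SELECT 1;
COMMIT;

-- plpgsql: BEGIN opens the block; writing "BEGIN;" here is the
-- common mistake this thread is about, and is a syntax error.
DO $$
BEGIN
    RAISE NOTICE 'inside the block';
END;
$$;
```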
Re: [HACKERS] why semicolon after begin is not allowed in postgresql?
Mark Kirkwood-2 wrote: Postgres supports many procedural languages (e.g. plperl, plpython) and all of these have different grammar rules from SQL - and from each other. We can't (and shouldn't) try altering them to be similar to SQL - it would defeat the purpose of providing a procedural environment where the given language works as advertised. So in the case of plpgsql - it needs to follow the Ada grammar, otherwise it would be useless.

I do not follow the "useless" conclusion - what, present day, does Ada have to do with it? And the request is to alter only plpgsql, not all the other languages. To the casual end-user, plpgsql is an internal language under our full control, installed by default in all new releases. Is it really unreasonable to expect some level of designed-in coordination between it and SQL?

Cross-compatibility is a valid reason, though I'm guessing that, with all the inherent differences between our standard PL and other databases' PLs, making this change would not be a materially noticeable additional incompatibility. I'll even accept "language consistency" and "not worth the effort of special-casing", but mostly because the error is immediate and obvious, and the solution is simple and readily learned.

A side observation: why does DECLARE not require a block-end keyword, with BEGIN acting effectively as both start and end? BEGIN, IF, FOR, etc. all come in pairs, but DECLARE does not.

David J.
Re: [HACKERS] why semicolon after begin is not allowed in postgresql?
Andrew Dunstan wrote: On 11/25/2013 06:13 PM, David Johnston wrote: A side observation: why does DECLARE not require a block-end keyword, with BEGIN acting effectively as both start and end? BEGIN, IF, FOR, etc. all come in pairs, but DECLARE does not.

A complete block is:

[ DECLARE
    declarations ]
BEGIN
    statements
[ EXCEPTION
    handlers ]
END

The DECLARE and EXCEPTION parts are optional, as indicated. Does that make it clearer?

Doh! IF / THEN / ELSE / ENDIF (concept, not syntax). That also helps to reinforce the point being made here...

David J.
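The complete block shape quoted above, with both optional sections present, in a small runnable sketch:

```sql
DO $$
DECLARE                              -- optional DECLARE section
    n integer := 0;
BEGIN
    n := 10 / n;                     -- raises division_by_zero
EXCEPTION                            -- optional EXCEPTION section
    WHEN division_by_zero THEN
        RAISE NOTICE 'caught division by zero';
END;                                 -- END closes the whole block
$$;
```

Seen this way, DECLARE is not a block of its own needing an end keyword; it is the opening section of the one block that END closes.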