Re: [HACKERS] Proposal: knowing detail of config files via SQL
On Fri, Jan 30, 2015 at 12:05 PM, David Fetter da...@fetter.org wrote:
> On Sat, Jan 31, 2015 at 12:50:20AM +0900, Sawada Masahiko wrote:
> > [postgres][5432](1)=# select * from pg_file_settings where name = 'work_mem';
> > -[ RECORD 1 ]--
> > name       | work_mem
> > setting    | 128MB
> > sourcefile | /home/masahiko/pgsql/master/3data/hoge.conf
> > -[ RECORD 2 ]--
> > name       | work_mem
> > setting    | 64MB
> > sourcefile | /home/masahiko/pgsql/master/3data/postgresql.auto.conf
>
> Masahiko-san,
>
> I apologize for not communicating clearly. My alternative proposal is to
> add a NULL-able sourcefile column to pg_settings, as an alternative to
> creating a new pg_file_settings view. Why might the contents of
> pg_settings be different from what would be in pg_file_settings, apart
> from the existence of this column?

The contents of pg_settings uses the setting name as a primary key. The
contents of pg_file_settings uses a compound key consisting of the setting
name and the source file. See work_mem in the provided example. More
specifically, pg_settings only cares about the currently active setting
while pg_file_settings covers inactive settings as well.

David J.
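To illustrate the keying difference being discussed - this is a minimal sketch, not PostgreSQL source code; the file paths are taken from the example above and the "last file read wins" rule is a simplification of the real precedence logic:

```python
# Entries as they might be parsed from configuration files, in read order.
parsed = [
    ("work_mem", "128MB", "/home/masahiko/pgsql/master/3data/hoge.conf"),
    ("work_mem", "64MB", "/home/masahiko/pgsql/master/3data/postgresql.auto.conf"),
]

# pg_settings-style: the setting name is the primary key, so a later
# source simply overwrites an earlier one and only the active value survives.
pg_settings = {}
for name, value, source in parsed:
    pg_settings[name] = (value, source)

# pg_file_settings-style: the compound (name, sourcefile) key retains
# every file that assigns the parameter, active or not.
pg_file_settings = {(name, source): value for name, value, source in parsed}

print(pg_settings["work_mem"])   # only the winning value
print(len(pg_file_settings))     # both entries retained
```

The compound key is what lets the view show two rows for work_mem where pg_settings can only ever show one.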
Re: [HACKERS] Small bug on CREATE EXTENSION pgq...
On Wednesday, January 28, 2015, Stephen Frost sfr...@snowman.net wrote:
> * David G Johnston (david.g.johns...@gmail.com) wrote:
> > Jerry Sievers-3 wrote
> > > Hackers; I noticed this trying to import a large pg_dump file with
> > > warnings suppressed. It seems loading pgq sets client_min_messages to
> > > warning and leaves it this way, which defeats an attempt to change the
> > > setting prior and have it stick. I tested with several other
> > > extensions in the same DB and only pgq has the problem. Sorry if this
> > > is known/fixed already.
> >
> > This is not part of PostgreSQL proper and thus not supported by
> > -hackers; you should report this directly to the authors.
>
> Ehh.. Shouldn't we try to take a bit more care that we reset things after
> a CREATE EXTENSION is run? Not really sure how much effort we want to put
> into it, but I see a bit of blame on both sides here.

Fair enough, but reset to what? I don't know the internal mechanics, but if
the session default is warning and a local change sets it to notice then an
unconditional reset would not get us back to the intended value. Now, if
extensions can be handled similarly to how functions operate, where one can
define function/extension-local settings and have them revert after
resolution, that might be OK.

That said, while there isn't any way to prevent it, the log_min GUCs should
not be changed by extensions; that decision should be left up to the user.
The complaint is not that the setting should be reset but that the
installation script should not even care what the setting is.

David J.
Re: [HACKERS] Proposal: knowing detail of config files via SQL
On Thu, Jan 22, 2015 at 3:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> David G Johnston david.g.johns...@gmail.com writes:
> > Tom Lane-2 wrote
> > > regression=# alter system reset timezone;
> > > ALTER SYSTEM
> > > regression=# select pg_reload_conf();
> >
> > How does someone know that performing the above commands will result in
> > the TimeZone setting being changed from Asia/Shanghai to US/Eastern?
>
> Is that a requirement, and if so why? Because this proposal doesn't
> guarantee any such knowledge AFAICS.

The proposal provides for SQL access to all possible sources of variable
value settings and, ideally, a means of ordering them in priority order. A
search for TimeZone would return two records - one for postgresql.auto.conf
and one for postgresql.conf, numbered 1 and 2 respectively - so that, in
looking at that result, if the postgresql.auto.conf entry were to be
removed the user would know what value in postgresql.conf would become
active. Furthermore, if postgresql.conf has a setting AND there is a
mapping in an #included file, that information would be accessible via SQL
as well.

David J.
Re: [HACKERS] Proposal: knowing detail of config files via SQL
On Thu, Jan 22, 2015 at 3:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> David Johnston david.g.johns...@gmail.com writes:
> > On Thu, Jan 22, 2015 at 3:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> > > Is that a requirement, and if so why? Because this proposal doesn't
> > > guarantee any such knowledge AFAICS.
> >
> > The proposal provides for SQL access to all possible sources of
> > variable value settings and, ideally, a means of ordering them in
> > priority order, so that a search for TimeZone would return two records,
> > one for postgresql.auto.conf and one for postgresql.conf - which are
> > numbered 1 and 2 respectively - so that in looking at that result, if
> > the postgresql.auto.conf entry were to be removed the user would know
> > what value in postgresql.conf would become active. Furthermore, if
> > postgresql.conf has a setting AND there is a mapping in an #included
> > file, that information would be accessible via SQL as well.
>
> I know what the proposal is. What I am questioning is the use-case that
> justifies having us build and support all this extra mechanism. How often
> does anyone need to know what the next variable value down would be? And
> if they do need to know whether a variable is set in postgresql.conf, how
> often wouldn't they just resort to grep instead? (Among other points,
> grep would succeed at noticing commented-out entries, which this
> mechanism would not.) GUC has existed in more or less its current state
> for about 15 years, and I don't recall a lot of complaints that would be
> solved by this. Furthermore, given that ALTER SYSTEM was sold to us as
> more or less obsoleting manual editing of postgresql.conf, I rather doubt
> that it's changed the basis of discussion all that much.

I doubt we'd actually see many complaints since, like you say, people are
likely to just use a different technique. The only thing PostgreSQL itself
needs to provide is a master inventory of seen/known settings; the user
interface can be left up to other layers (psql, pgAdmin, extensions,
etc...).

It falls into making the system more user friendly. But maybe the answer
for those users is that if your setup is so complex that this would be
helpful, you probably need to rethink your setup. Then again, if I only
interact with the system via SQL I can at least issue RESET and know the
end result - after a config reload - without having to log into the server
and grep the config file, or know what the system defaults are for settings
that do not appear explicitly in postgresql.conf; all without having to
refer to the documentation as well.

I'm doubtful it is going to interest any of the core hackers to put this in
place, but it at least warrants a couple of passes of brain-storming which
can then be memorialized on the Wiki ToDo.

David J.
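The priority-ordering idea from this exchange can be sketched in a few lines. This is an illustrative model only - the source names and priority numbers mirror the TimeZone example in the thread, not any actual PostgreSQL catalog:

```python
def settings_for(name, entries):
    """All (priority, source, value) entries for `name`, best first."""
    # Lower priority number wins, matching the 1/2 numbering in the thread.
    return sorted((prio, src, val) for (n, src, prio, val) in entries if n == name)

# Every place the setting is assigned, as the proposed view would expose it.
entries = [
    ("TimeZone", "postgresql.auto.conf", 1, "Asia/Shanghai"),
    ("TimeZone", "postgresql.conf", 2, "US/Eastern"),
]

ordered = settings_for("TimeZone", entries)
active = ordered[0]      # the value currently in effect
next_down = ordered[1]   # what ALTER SYSTEM RESET + reload would reveal
print(active, next_down)
```

With such an inventory, the "what happens after RESET" question answers itself: drop the top-priority entry and the next one down becomes active.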
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Tue, Jan 20, 2015 at 1:24 PM, Jim Nasby jim.na...@bluetreble.com wrote:
> On 1/16/15 10:32 PM, David G Johnston wrote:
> > Two changes solve this problem in what seems to be a clean way.
> > 1) Upon each parsing of postgresql.conf we store all assigned variables
> > somewhere
>
> Parsing is relatively cheap, and it's not like we need high performance
> from this. So, -1 on permanent storage.

OK

> > 2) We display these assignments in a new pg_settings column named
> > system_reset_val
> >
> > I would also extend this to include:
> > a) upon each parsing of postgresql.auto.conf we store all assigned
> > variables somewhere (maybe the same place as postgresql.conf and simply
> > label the file source)
>
> You cannot assume there are only postgresql.conf and
> postgresql.auto.conf. Complex environments will have multiple included
> files.

#included files still appear to come from postgresql.conf; I'm not
proposing we try to memorize every single instance of a variable
declaration and provide a global overlaps query - though that piece
already seems to be in place...

> > b) add an alter_system_val field to show that value (or null)
> > c) add a db_role_val to show the current value for the session via
> > pg_db_role_setting
>
> You're forgetting that there are also per-role settings. And I'm with
> Robert; what's wrong with sourcefile and sourceline? Perhaps we just need
> to teach those about ALTER ROLE SET and ALTER DATABASE SET (if they
> don't already know about them).

Actually, there are not separate PER ROLE and PER DATABASE settings even
though there are different SQL commands. The catalog is simply
pg_db_role_setting, with the "all" tags (i.e., 0) used as necessary. But I
see where you are going and do not disagree that precedence information
could be useful.

sourceline and sourcefile pertain only to the current value, while the
point of adding these other pieces is to provide a snapshot of all the
different mappings that the system knows about, instead of having to tell
a user to go look in two different files (and associated includes) and a
database catalog to find out what possible values are in place. That
doesn't solve the need to scan the catalog to see other possible values -
though you could at least put a counter in pg_settings that indicates how
many pg_db_role_setting entries reference the given variable, so that if
non-zero the user is clued into the fact that they need to check out said
catalog table.

> > c.1) add a db_role_id to show the named user that is being used for the
> > db_role_val lookup
> >
> > The thinking for c.1 is that in situations with role hierarchies and
> > SET ROLE usage it would be nice to be able to query what the connection
> > role - the one used during variable lookup - is.
>
> I'm losing track of exactly what we're trying to solve here, but... If
> the goal is to figure out what settings would be in place for a specific
> user connecting to a specific database, then we should create a SRF that
> does just that (accepting a database name and role name). You could then
> do...
>
> SELECT * FROM pg_show_all_settings('database', 'role') a;

This is fine - but I'm thinking about situations where users immediately
SET ROLE on their session and typically operate as said user; if they try
doing an ALTER ROLE SET for this SET ROLE user it will not work, because
their login user is what matters. This is probably a solution looking for
a problem, but it is a dynamic that one needs to be aware of.

> > I'm probably going overkill on this but there are not a lot of
> > different sources, nor do they change frequently, so extending the
> > pg_settings view to be more of a one-stop shop for this information
> > seems to make sense.
>
> Speaking of overkill... one thing that you currently can't do is find out
> what #includes have been processed. Perhaps it's worth having a SRF that
> would return that info...

> > As it relates back to this thread, the desired merging ends up being
> > done inside this view and at least gives the DBA a central location
> > (well, along with pg_db_role_setting) to go and look at the
> > configuration landscape for the system.
>
> I think the goal is good, but the interface needs to be rethought.
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com

My main concern with the UI would be too-much-information; but otherwise,
if we accept the premise that we should include as much as possible and
then let the user (or us) provide more useful subsets based upon common
use-cases, the main issue is making sure we've identified all of the
relevant information that needs to be captured - even if it is something
as minor as the whole logon vs. current role dynamic.

I'm ignoring the cost/benefit aspect of implementation for the moment,
largely because I don't know enough about the backend to make reasonable
comments. But much of the data is already present in various views and
catalogs - I just think having a one-stop shop would be useful and go a
long way.

David J.
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Tue, Jan 20, 2015 at 8:46 PM, Robert Haas robertmh...@gmail.com wrote:
> On Tue, Jan 20, 2015 at 4:07 PM, David Johnston
> david.g.johns...@gmail.com wrote:
> > sourceline and sourcefile pertain only to the current value, while the
> > point of adding these other pieces is to provide a snapshot of all the
> > different mappings that the system knows about, instead of having to
> > tell a user to go look in two different files (and associated includes)
> > and a database catalog to find out what possible values are in place.
> > That doesn't solve the need to scan the catalog to see other possible
> > values - though you could at least put a counter in pg_settings that
> > indicates how many pg_db_role_setting entries reference the given
> > variable, so that if non-zero the user is clued into the fact that they
> > need to check out said catalog table.
>
> This last proposal seems pointless to me. If the user knows about
> pg_db_role_setting, they will know to check it; if they don't, a counter
> won't fix that. I can see that there might be some utility to a query
> that would tell you, for a given setting, all sources of that setting the
> system knew about, whether in configuration files, pg_db_role_setting, or
> the current session. But I don't see that putting information that's
> already available via one system catalog query into a different system
> catalog query helps anything - we should presume DBAs know how to write
> SQL.

While this is not exactly a high-traffic catalog/view, why have them write
a likely 4-way join query (pg_db_role_setting is not the friendliest in
its current form), or even two separate queries, when we can give them a
solid and comprehensive standard view? I guess part of the mixup is that I
am basically talking about a view of multiple catalogs as a DBA UI, as
opposed to really even caring what specific catalogs (existing or
otherwise) or functions are needed to make the whole thing work.

Maybe it does all fit directly on pg_settings, but tacking on some
read-only columns to this updateable view/table doesn't come across as
something that should be forbidden in general.

Maybe I am imagining a use-case that just isn't there, but if there are
two separate queries needed, and we call one consolidated, then having
that query indicate whether the other query has useful information - and
it is quite likely that it will not - avoids the user expending the effort
to run the wasteful secondary query.

David J.
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Fri, Jan 16, 2015 at 10:08 PM, Amit Kapila amit.kapil...@gmail.com wrote:
> On Sat, Jan 17, 2015 at 10:02 AM, David G Johnston
> david.g.johns...@gmail.com wrote:
> > > You're right. pg_settings and the SHOW command use the value in the
> > > current session rather than the config file. It might break these
> > > common infrastructures.
> >
> > Two changes solve this problem in what seems to be a clean way.
> > 1) Upon each parsing of postgresql.conf we store all assigned variables
> > somewhere
> > 2) We display these assignments in a new pg_settings column named
> > system_reset_val
> >
> > I would also extend this to include:
> > a) upon each parsing of postgresql.auto.conf we store all assigned
> > variables somewhere (maybe the same place as postgresql.conf and simply
> > label the file source)
>
> Do we need to perform this parsing whenever the user queries pg_settings?
> I think it might lead to extra cycles of reading the file when the user
> won't even need it, and as the code is shared with SHOW commands that
> could be slightly complicated.

There would be no parsing upon reading of pg_settings, only lookups. The
existing parsing would simply have its values saved to the catalogs that
will be involved in the underlying pg_settings view query.

David J.
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Fri, Jan 16, 2015 at 10:19 PM, Amit Kapila amit.kapil...@gmail.com wrote:
> On Sat, Jan 17, 2015 at 10:41 AM, David Johnston
> david.g.johns...@gmail.com wrote:
> > On Fri, Jan 16, 2015 at 10:08 PM, Amit Kapila amit.kapil...@gmail.com
> > wrote:
> > > On Sat, Jan 17, 2015 at 10:02 AM, David G Johnston
> > > david.g.johns...@gmail.com wrote:
> > > > > You're right. pg_settings and the SHOW command use the value in
> > > > > the current session rather than the config file. It might break
> > > > > these common infrastructures.
> > > >
> > > > Two changes solve this problem in what seems to be a clean way.
> > > > 1) Upon each parsing of postgresql.conf we store all assigned
> > > > variables somewhere
> > > > 2) We display these assignments in a new pg_settings column named
> > > > system_reset_val
> > > > I would also extend this to include:
> > > > a) upon each parsing of postgresql.auto.conf we store all assigned
> > > > variables somewhere (maybe the same place as postgresql.conf and
> > > > simply label the file source)
> > >
> > > Do we need to perform this parsing whenever the user queries
> > > pg_settings? I think it might lead to extra cycles of reading the
> > > file when the user won't even need it, and as the code is shared with
> > > SHOW commands that could be slightly complicated.
> >
> > There would be no parsing upon reading of pg_settings, only lookups.
> > The existing parsing would simply have its values saved to the catalogs
> > that will be involved in the underlying pg_settings view query.
>
> So are you telling that whenever we read, we save the settings to some
> catalog (probably a new one)? Will that completely address the problem
> specified in this thread, as those values could probably be old (values
> from when the server was last started, or from the last SIGHUP)?

Cache invalidation is a hard problem to solve :)

I am reasonably content with showing the user - who has just updated their
postgresql.conf file and boots/SIGHUPs the server only to find that said
value hasn't taken effect - that their value is indeed sitting in
postgresql.conf but that other parts of the system are preempting it from
being the active value.

David J.
Re: [HACKERS] Re: Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs
On Tue, Jan 6, 2015 at 4:15 PM, Jim Nasby jim.na...@bluetreble.com wrote:
> On 1/6/15, 10:32 AM, Alvaro Herrera wrote:
> > Tom Lane wrote:
> > > What would make sense to me is to teach the planner about inlining
> > > SQL functions that include ORDER BY clauses, so that the performance
> > > issue of a double sort could be avoided entirely transparently to the
> > > user.
> >
> > Wouldn't this be applicable to functions in other languages too, not
> > only SQL?
>
> Dumb question... we can inline functions from other languages? What chunk
> of code handles that?

Not that I know of. The point being made here is that suggesting an
alternative that requires inlining doesn't cover the entire purpose of
this feature, since the feature can be applied to functions that cannot be
inlined.

David J.
Re: [HACKERS] [GENERAL] ON_ERROR_ROLLBACK
On Tue, Dec 30, 2014 at 8:54 AM, Adrian Klaver adrian.kla...@aklaver.com wrote:
> On 12/30/2014 07:43 AM, David G Johnston wrote:
> > Tom Lane-2 wrote
> > > Bernd Helmle <mailings@...> writes:
> > > > --On 29. Dezember 2014 12:55:11 -0500 Tom Lane <tgl@...> wrote:
> > > > > Given the lack of previous complaints, this probably isn't
> > > > > backpatching material, but it sure seems like a bit of attention
> > > > > to consistency would be warranted here.
> > > >
> > > > Now that I read it I remember a client complaining about this some
> > > > time ago. I forgot about it, but I think there's value in
> > > > back-patching it.
> > >
> > > Hm. Last night I wrote the attached draft patch, which I was
> > > intending to apply to HEAD only. The argument against back-patching
> > > is basically that this might change the interpretation of scripts
> > > that had been accepted silently before. For example, \set ECHO_HIDDEN
> > > NoExec will now select noexec mode whereas before you silently got on
> > > mode. In one light this is certainly a bug fix, but in another it's
> > > just definitional instability. If we'd gotten a field bug report we
> > > might well have chosen to back-patch, though, and perhaps your
> > > client's complaint counts as that. Opinions anyone?
> >
> > -0.5 for back-patching
> >
> > The one thing supporting this is that we'd potentially be fixing
> > scripts that are broken but don't know it yet. But the downside of
> > changing active settings for working scripts - even if they are only
> > accidentally working - is enough to counter that for me. Being more
> > liberal in our acceptance of input is more feature than bug fix, even
> > if we document that we accept more items.
>
> It is more about being consistent than liberal. Personally I think a
> situation where for one variable 0 = off but for another 0 = on is a bug.

I can sorta buy the consistency angle, but what seals it for me is script
portability - the ability to write a script and instructions using the
most current release and have it run on previous versions without having
to worry about this kind of incompatibility.

So, +1 for back-patching from me.

David J.
Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)
On Thu, Dec 11, 2014 at 2:05 PM, Bruce Momjian br...@momjian.us wrote:
> On Thu, Dec 11, 2014 at 10:04:43AM -0700, David G Johnston wrote:
> > Tom Lane-2 wrote
> > > Robert Haas <robertmhaas@...> writes:
> > > > On Thu, Dec 11, 2014 at 1:18 AM, Josh Berkus <josh@...> wrote:
> > > > > While there were technical issues, 9.4 dragged a considerable
> > > > > amount because most people were ignoring it in favor of 9.5
> > > > > development.
> > > >
> > > > I think 9.4 dragged almost entirely because of one issue: the
> > > > compressibility of JSONB.
> > >
> > > 2. The amount of pre-release testing we get from people outside the
> > > hard-core development crowd seems to be continuing to decrease. We
> > > were fortunate that somebody found the JSONB issue before it was too
> > > late to do anything about it. Personally, I'm very worried that there
> > > are other such bugs in 9.4. But I've given up hoping that any more
> > > testing will happen until we put out something that calls itself
> > > 9.4.0, which is why I voted to release in the core discussion about
> > > it.
> >
> > The compressibility properties of a new type seem like something that
> > should be mandated before it is committed - it shouldn't require good
> > fortune that
>
> Odds are the next problem will have nothing to do with compressibility
> --- we can't assume old failures will repeat themselves.

tl;dr: Assign two people, a manager/curator and a lead reviewer. Give the
curator better tools and the responsibility to engage the community. If
the primary reviewer cannot review a patch in the current commitfest it
can be marked "awaiting reviewer" and moved to the next CF for evaluation
by the next pair of individuals. At minimum, though, if it does get moved
the manager AND reviewer need to comment on why it was not handled during
their CF. Also, formalize documentation targeting developers and reviewers
just like the documentation for users has been formalized and committed to
the repository. That knowledge and history is probably more important than
the source code commit log and definitely should be more accessible to
newcomers.

While true, a checklist of things to look for and evaluate when adding a
new type to the system still has value. How new types interact, if at all,
with TOAST seems like something that warrants explicit attention before
commit, and there are probably others (e.g., OIDs, input/output function
volatility and configuration, etc...).

Maybe this exists somewhere, but if you are considering improvements to
the commitfest application, having a top-of-page always-visible checklist
that can be bootstrapped based upon the patch/feature type and modified
for specific nuances of the item in question seems like it would be
valuable. If this had been in place before 9.3, then whatever category the
multi-xact patches fall into would want to have its template modified to
incorporate the various research and testing - along with links to the
discussions - that resulted from the various bug reports that were
submitted.

It could even be structured to be both an interactive checklist and
curated documentation for developers and reviewers. The wiki has some of
this, but if the goal is to encourage people to learn how to contribute to
PostgreSQL, it should receive a similar level of attention and quality
control that our documentation for people wanting to learn how to use
PostgreSQL receives. But that is above and beyond the simple goal of
having meaningful checklists attached to each of the major commitfest
items, whose TODO items can be commented upon and serve as a reference for
how close to commit a feature may be. Comments can be as simple as a place
for people to upload a psql script and say "this is what I did and
everything seemed to work/fail in the way I expected - on this platform."
Curation is hard, though, so I get why easier actions - mainly just
providing links to the mailing list - are what is currently being used.

Instead of the CF manager being a reviewer (which is a valid approach),
having them be a curator, and providing better tools geared toward that
role (both to do the work and to share the results), seems like a better
ideal. Alongside that role a CF reviewer should maybe also be assigned.
The two people - reviewer and manager - would then be responsible for
ensuring that reviews are happening and that the communication to and
recruitment from the community is being handled, respectively.

Just some off-the-cuff thoughts...

David J.
Re: [HACKERS] controlling psql's use of the pager a bit more
On Thu, Nov 13, 2014 at 10:55 AM, Merlin Moncure mmonc...@gmail.com wrote:
> On Thu, Nov 13, 2014 at 11:39 AM, Robert Haas robertmh...@gmail.com
> wrote:
> > On Thu, Nov 13, 2014 at 11:54 AM, David G Johnston
> > david.g.johns...@gmail.com wrote:
> > > Because I might be quite happy with 100 or 200 lines I can just
> > > scroll in my terminal's scroll buffer, but want to use the pager for
> > > more than that. This is useful especially if I want to scroll back
> > > and see the results from a query or two ago.
> > >
> > > +1
> >
> > +1 from me, too.
>
> me three (as long as the current behaviors are not messed with and the
> new stuff is 'opt in' somehow -- I also use the force-quit option to
> less).

Being able to use \pset pager to toggle the capability seems useful. Thus
I'd suggest establishing a new pager_minlines option that, if unset (the
default), maintains the current behavior, but which can have a value (0 to
INT_MAX) where 0 ends up basically doing the same thing as the "always"
mode for pager. Leaving the value blank will cause the option to be unset
again and revert to the current behavior.

David J.
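The proposed pager_minlines semantics can be sketched as follows. This is a minimal illustrative model, not psql source code - pager_minlines is the hypothetical option under discussion, and the "page when output exceeds the screen" rule stands in for psql's actual (more involved) decision:

```python
def use_pager(output_lines, screen_lines, pager_on=True, pager_minlines=None):
    """Decide whether to route output through the pager."""
    if not pager_on:                 # \pset pager off
        return False
    if pager_minlines is None:       # option unset: today's behavior
        return output_lines > screen_lines
    if pager_minlines == 0:          # 0 behaves like \pset pager always
        return True
    return output_lines >= pager_minlines

# 150 lines fit comfortably in the terminal's scroll buffer...
print(use_pager(150, 40, pager_minlines=200))
# ...but 250 lines trigger the pager.
print(use_pager(250, 40, pager_minlines=200))
```

The point of the threshold is exactly the use case quoted above: moderate output stays in the scroll buffer so earlier results remain reachable, while genuinely long output still gets paged.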
Re: [HACKERS] Re: [GENERAL] Performance issue with libpq prepared queries on 9.3 and 9.4
On Thu, Nov 13, 2014 at 5:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> David G Johnston david.g.johns...@gmail.com writes:
> > Tom Lane-2 wrote
> > > In the meantime, I assume that your real data contains a small
> > > percentage of values other than these two? If so, maybe cranking up
> > > the statistics target would help. If the planner knows that there are
> > > more than two values in the column, I think it would be less
> > > optimistic about assuming that the comparison value is one of the big
> > > two.
> >
> > Is there any value (or can value be added) in creating a partial index
> > of the form archetype IN ('banner','some other rare value'), such that
> > the planner will see that such a value is possible but infrequent and
> > will, in the presence of a plan using a value contained in the partial
> > index, refuse to use a generic plan, knowing that it will be unable to
> > use the very specific index that the user created?
>
> The existence of such an index wouldn't alter the planner's statistics.
> In theory we could make it do so, but I seriously doubt the cost-benefit
> ratio is attractive, either as to implementation effort or the added
> planning cost.

[adding -general back in...]

While "planner hints" comes to mind... on the SQL side, can we extend the
PREPARE command with two additional keywords?

PREPARE name [ ( data_type [, ...] ) ] [ [NO] GENERIC ] AS statement

I was originally thinking this could attach to EXECUTE, and maybe it could
there as well. If EXECUTE is bare, whatever the PREPARE used would be in
effect (a bare PREPARE exhibiting the current dynamic behavior). If
EXECUTE and PREPARE disagree, EXECUTE wins and the current call is
(re-)prepared as requested.

We have introduced intelligence to PREPARE/EXECUTE that is not always
favorable but provide little way to override it if the user has superior
knowledge. The dual role of prepared statements - to both prevent SQL
injection and create cacheable generic plans - further complicates things.
In effect, by supplying NO GENERIC on the PREPARE, the caller is saying
they only wish to make use of the SQL-injection-prevention aspect of
prepared statements. Adding the EXECUTE piece allows for the same plan to
be used in injection-prevention mode if the caller knows that the
user-supplied value does not play well with the generic plan.

David J.
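The dynamic behavior the NO GENERIC flag would override can be sketched as follows. This is a loose illustrative model of the server's plan-cache heuristic (try a few parameter-specific plans, then switch to the generic plan if it is no worse on average); the threshold of five and the force flag are modeling assumptions, not actual server code:

```python
def choose_plan(custom_costs, generic_cost, force=None):
    """Pick 'generic' or 'custom' for the next EXECUTE.

    custom_costs: estimated costs of custom plans built so far.
    force: 'generic', 'no-generic', or None for the dynamic heuristic.
    """
    if force == "generic":
        return "generic"
    if force == "no-generic":        # PREPARE ... NO GENERIC: always replan
        return "custom"
    if len(custom_costs) < 5:        # build a few custom plans first
        return "custom"
    avg_custom = sum(custom_costs) / len(custom_costs)
    return "generic" if generic_cost <= avg_custom else "custom"

# A rare parameter value makes the generic plan look cheap on average but
# terrible for this call; NO GENERIC keeps the parameter-aware plan.
print(choose_plan([10.0] * 5, 8.0))                      # heuristic picks generic
print(choose_plan([10.0] * 5, 8.0, force="no-generic"))  # caller overrides
```

This is the scenario in the complaint above: once the heuristic locks onto the generic plan, a caller with superior knowledge of a skewed value distribution has no SQL-level way to opt out.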
Re: [HACKERS] idea: allow AS label inside ROW constructor
On Thu, Oct 23, 2014 at 8:51 AM, Andrew Dunstan and...@dunslane.net wrote:
> On 10/23/2014 11:36 AM, David G Johnston wrote:
> > Andrew Dunstan wrote
> > > On 10/23/2014 09:57 AM, Florian Pflug wrote:
> > > > On Oct 23, 2014, at 15:39, Andrew Dunstan <andrew@...> wrote:
> > > > > On 10/23/2014 09:27 AM, Merlin Moncure wrote:
> > > > > > On Thu, Oct 23, 2014 at 4:34 AM, Pavel Stehule
> > > > > > <pavel.stehule@...> wrote:
> > > > > > > postgres=# select row_to_json(row(10 as A, row(30 as c, 20 AS B) as x));
> > > > > > >          row_to_json
> > > > > > > ------------------------------
> > > > > > >  {"a":10,"x":{"c":30,"b":20}}
> > > > > > > (1 row)
> > > > > >
> > > > > > wow -- this is great. I'll take a look.
> > > > >
> > > > > Already in 9.4:
> > > > >
> > > > > andrew=# select json_build_object('a',10,'x',json_build_object('c',30,'b',20));
> > > > >            json_build_object
> > > > > ----------------------------------------
> > > > >  {"a" : 10, "x" : {"c" : 30, "b" : 20}}
> > > > > (1 row)
> > > > >
> > > > > So I'm not sure why we want another mechanism unless it's needed
> > > > > in some other context.
> > > >
> > > > I've wanted to name the fields of rows created with ROW() on more
> > > > than one occasion, quite independent from whether the resulting row
> > > > is converted to JSON or not. And quite apart from usefulness, this
> > > > is a matter of orthogonality. If we have named fields in anonymous
> > > > record types, we should provide a convenient way of specifying the
> > > > field names. So to summarize, I think this is an excellent idea,
> > > > json_build_object notwithstanding.
> > >
> > > Well, I think we need to see those other use cases. The only use case
> > > I recall seeing involves the already provided case of constructing
> > > JSON.
> >
> > Even if it simply allows CTEs and subqueries to form anonymous record
> > types which can then be re-expanded in the outer layer for table-like
> > final output, this feature would be useful. When working with wide
> > tables and using multiple aggregates and joins, being able to avoid
> > specifying individual columns repeatedly is quite desirable. It would
> > be especially nice to not have to use AS, though, if the source fields
> > are already so named.
>
> You can already name the output of CTEs and in many cases subqueries,
> too. Maybe if you or someone gave a concrete example of something you
> can't do that this would enable I'd be more convinced.
>
> cheers
>
> andrew

Mechanically, I've wanted to do the following without creating an actual
type:

{query form}
WITH invoiceinfo (invoiceid, invoicedetailentry) AS (
  SELECT invoiceid,
         ROW(itemid, itemdescription, itemcost, itemsale, itemquantity)
  FROM invoicelines
)
[... other CTE joins and stuff here... can carry around the 5 info fields
in a single composite until they are needed]
SELECT invoiceid, (invoicedetailentry).* FROM invoiceinfo;

{working example}
WITH invoiceinfo (invoiceid, invoicedetailentry) AS (
  SELECT invoiceid,
         ROW(itemid, itemdescription, itemcost, itemsale, itemquantity)
  FROM (VALUES ('1',1,'1',0,1,1))
       invoicelines (invoiceid, itemid, itemdescription, itemcost,
                     itemsale, itemquantity)
)
SELECT invoiceid, (invoicedetailentry).* FROM invoiceinfo;

This is made up but not dissimilar to what I have worked with. That said,
I can and do usually either just join in the details one time, or I need
to do more with the details than just carry them around, and so providing
a named type usually ends up being the way to go. Regardless, the form is
representative. My most recent need for this ended up being best handled
with named types and support functions operating on those types, so it's
hard to say I have a strong desire for this, but anyway.

David J.
Re: [HACKERS] Trailing comma support in SELECT statements
> > We might as well allow a final trailing (or initial leading) comma on a
> > values list at the same time: VALUES (...), (...), (...),
>
> Do you know that this feature is proprietary and not based on ANSI SQL?
> Any user that uses this feature and later wants to port to another
> database will hate it.
>
> Regards
>
> Pavel

I've got no complaint if "at the same time" means that neither behavior is
ever implemented...

David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Fri, Sep 26, 2014 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: If we want the narrowest possible fix for this, I think it's "complain if a non-zero value would round to zero." That fixes the original complaint and changes absolutely nothing else. But I think that's kind of wussy. Yeah, rounding 29 seconds down to a special magic value of 0 is more surprising than rounding 30 seconds up to a minute, but the latter is still surprising. We're generally not averse to tighter validation, so why here? So in other words, if I set shared_buffers = 100KB, you are proposing that that be rejected because it's not an exact multiple of 8KB? This seems like it's throwing away one of the fundamental reasons why we invented GUC units in the first place. Both Robert and myself at one point made suggestions to this effect but I believe at this point we are both good with simply solving the rounds-to-zero case and calling it a day. I apparently have got to make this point one more time: if the user cares about the difference between 30sec and 1min, then we erred in designing the GUC in question; it should have had a smaller unit. I am completely not impressed by arguments based on such cases. The right fix for such a case is to choose a different unit for the GUC. You are right - both those values are probably quite stupid to set log_rotation_age to...but we cannot error if they choose 1min. Stephen suggested changing the unit, but that particular cure seems to be considerably worse than simply changing floor() to ceil(). I'm out of arguments for why the minimally invasive solution is, for me, the best of the currently proposed solutions. That option gets my +1. I have to ask someone else to actually write such a patch but it seems straightforward enough, as no one has complained that the solution would be hard to implement - only that they dislike the idea. David J
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Fri, Sep 26, 2014 at 2:02 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 26, 2014 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: If we want the narrowest possible fix for this, I think it's "complain if a non-zero value would round to zero." That fixes the original complaint and changes absolutely nothing else. But I think that's kind of wussy. Yeah, rounding 29 seconds down to a special magic value of 0 is more surprising than rounding 30 seconds up to a minute, but the latter is still surprising. We're generally not averse to tighter validation, so why here? So in other words, if I set shared_buffers = 100KB, you are proposing that that be rejected because it's not an exact multiple of 8KB? Absolutely. And if anyone is inconvenienced by that, then they should upgrade to a 386. Seriously, who is going to set a value of shared_buffers that is not measured in MB? And if they do, why shouldn't we complain if we can't honor the value exactly? If they really put in a value that small, it's not stupid to think that the difference between 96kB and 104kB is significant. Honestly, the most likely explanation for that value is that it's a developer doing testing. Related thought - why don't we allow the user to specify 1.5MB as a valid value? Since we don't, the rounding on the 8kB stuff makes sense, because not everyone wants to choose between 1MB and 2MB. A difference of 1 minute is not as noticeable. In the thread Tom linked to earlier the whole idea of a unit being 8kB (instead of 1 block) is problematic, and this is just another symptom of that. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Fri, Sep 26, 2014 at 2:27 PM, Stephen Frost sfr...@snowman.net wrote: Agreed - they're independent considerations and the original concern was about the nonzero-to-zero issue, so I'd suggest we address that first, though in doing so we will need to consider what *actual* min values we should have for some cases which currently allow going to zero for the special case and that, I believe, makes this all 9.5 material and allows us a bit more freedom in deciding how to handle things more generally. This is 9.5 material because 1) it isn't all that critical and 2) we DO NOT want a system to not come up because of a GUC parsing error after a minor-release update. I don't get where we need to do anything else besides that...the whole actual min values comment is unclear to me. My thought on rounding is simply no-complaint, no-change; round-to-nearest would be my first choice if implementing anew. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Friday, September 26, 2014, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Tom Lane wrote: The impression I had was that Stephen was thinking of actually setting min_val to 1 (or whatever) and handling zero or -1 in some out-of-band fashion, perhaps by adding GUC flag bits showing those as allowable special cases. I'm not sure how we would display such a state of affairs in pg_settings, but other than that it doesn't sound implausible. I would think that if we're going to have out of band values, we should just use off, not 0 or -1. Except off is not always semantically correct - some GUCs (not sure which ones ATM) use zero to mean 'default'. I think -1 always means off. Instead of 0 and -1 we'd need 'default' and 'off' with the ability to error if there is no meaningful default defined or if a feature cannot be turned off. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Friday, September 26, 2014, Alvaro Herrera alvhe...@2ndquadrant.com wrote: David Johnston wrote: On Friday, September 26, 2014, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Tom Lane wrote: The impression I had was that Stephen was thinking of actually setting min_val to 1 (or whatever) and handling zero or -1 in some out-of-band fashion, perhaps by adding GUC flag bits showing those as allowable special cases. I'm not sure how we would display such a state of affairs in pg_settings, but other than that it doesn't sound implausible. I would think that if we're going to have out of band values, we should just use off, not 0 or -1. Except off is not always semantically correct - some GUCs (not sure which ones ATM) use zero to mean 'default'. I think -1 always means off. Instead of 0 and -1 we'd need 'default' and 'off' with the ability to error if there is no meaningful default defined or if a feature cannot be turned off. Sure, off (and other spellings of a boolean false value) and default where they make sense, and whatever other values that make sense. My point is to avoid collapsing such logical values to integer/floating point values just because the option uses a numeric scale. My comment was more about storage than data entry. I'm not sure, though, that we'd want to allow 0 as valid input even if it is acceptable for Boolean. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Thursday, September 25, 2014, Gregory Smith gregsmithpg...@gmail.com wrote: On 9/25/14, 1:41 AM, David Johnston wrote: If the error message is written correctly most people upon seeing the error will simply fix their configuration and move on - regardless of whether they were proactive in doing so having read the release notes. The important part to realize here is that most people will never see such an error message. There is a person/process who breaks the postgresql.conf, a process that asks for a configuration restart/reload (probably via pg_ctl), and then the postmaster process creates a server log entry that shows the error (maybe in pgstartup.log, maybe in pg_log, maybe in syslog, maybe in the Windows Event Log). In practice, the top of that food chain never knows what's happening at the bottom unless something goes so seriously wrong the system is down, [...] And if the GUC setting here is wrong the system will be down, right? Otherwise the attempt at changing the setting will fail, and so even if the message itself is not seen the desired behavior of the system will remain as it was - which is just as valid a decision as rounding to zero or to 1. Just like we don't take responsibility for people not reading release notes or checking their configuration: if the DBA is not aware of where the GUCs are being set and of the logging destinations, that's not our problem. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Thu, Sep 25, 2014 at 12:11 AM, Gregory Smith gregsmithpg...@gmail.com wrote: On 9/24/14, 6:45 PM, Peter Eisentraut wrote: But then this proposal is just one of several others that break backward compatibility, and does so in a more or less silent way. Then we might as well pick another approach that gets closer to the root of the problem. I was responding to some desire to get a quick fix that cut off the worst of the problem, and the trade-offs of the other ideas bothered me even more. Obvious breakage is annoying, but small and really subtle version-to-version incompatibilities drive me really crazy. A 60X scale change to log_rotation_age will be big enough that impacted users should definitely notice, while not being so big it's scary. Rounding differences are small enough to miss. And I do see whacking the sole GUC that's set in minutes, which I was surprised to find is a thing that existed at all, as a useful step forward. Why? Do you agree that a log_rotation_age value defined in seconds is sane? If your reasoning is that everything else is defined in s and ms, then that is a developer, not a user, perspective. I'll buy into the "everything is defined in the smallest possible unit" approach - in which case the whole rounding thing becomes a non-issue - but if you leave some of these in seconds then we should still add an error if someone defines an insane millisecond value for those parameters. I can't agree with Stephen's optimism that some wording cleanup is all that's needed here. I predict it will be an annoying, multiple-commit job just to get the docs right, describing both the subtle rounding change and how it will impact users. (Cry an extra tear for the translators) Maybe I'm naive but I'm seriously doubting this. From what I can tell the rounding isn't currently documented and really doesn't need much, if anything, to be added. An error instead of rounding down to zero would be sufficient and self-contained. 
"The value specified is less than 1 in the parameter's base unit." Meanwhile, I'd wager the entire work of my log_rotation_scale unit change idea, from code to docs to release notes, will take less time than just getting the docs done on any rounding change. Probably cause less field breakage and confusion in the end too. You've already admitted there will be breakage. Your argument is that it will be obvious enough to notice. Why not just make it impossible? The bad case I threw out is a theorized one that shows why we can't go completely crazy with jerking units around, why 1000:1 adjustments are hard. I'm not actually afraid of that example being in the wild in any significant quantity. The resulting regression from a 60X scale change should not be so bad that people will suffer greatly from it either. Probably be like the big 10:1 change in default_statistics_target. Seemed scary that some people might be nailed by it; didn't turn out to be such a big deal. I don't see any agreement on the real root of a problem here yet. That makes gauging whether any smaller change leads that way or not fuzzy. I personally would be fine doing nothing right now, instead waiting until that's charted out--especially if the alternative is applying any of the rounding or error throwing ideas suggested so far. I'd say that I hate to be that guy who tells everyone else they're wrong, but we all know I enjoy it. Maybe not: I contend the root problem is that while we provide sane unit specifications we've assumed that people will always be providing values greater than 1 - but if they don't we silently use zero (a special value) instead of telling them they messed up (made an error). If the presence of config.d and such makes this untenable then I'd say those features have a problem - not our current choice to define what sane is. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Thu, Sep 25, 2014 at 12:46 AM, Tom Lane t...@sss.pgh.pa.us wrote: Gregory Smith gregsmithpg...@gmail.com writes: I don't see any agreement on the real root of a problem here yet. That makes gauging whether any smaller change leads that way or not fuzzy. I personally would be fine doing nothing right now, instead waiting until that's charted out--especially if the alternative is applying any of the rounding or error throwing ideas suggested so far. I'd say that I hate to be that guy who tells everyone else they're wrong, but we all know I enjoy it. TBH I've also been wondering whether any of these proposed cures are better than the disease. The changes that can be argued to make the behavior more sane are also ones that introduce backwards compatibility issues of one magnitude or another. And I do not have a lot of sympathy for let's not change anything except to throw an error in a case that seems ambiguous. That's mostly being pedantic, not helpful, especially seeing that the number of field complaints about it is indistinguishable from zero. Then what does it matter that we'd choose to error-out? I am personally not as scared of backwards-compatibility problems as some other commenters: I do not think that there's ever been a commitment that postgresql.conf contents will carry forward blindly across major releases. So I'd be willing to break strict compatibility in the name of making the behavior less surprising. But the solutions that have been proposed that hold to strict backwards compatibility requirements are not improvements IMHO. Or, put differently, the pre-existing behavior is fine, so don't fix what isn't broken. This patch simply fixes an oversight in the original implementation - that someone might try to specify an invalid value (i.e., between 0 and 1). If 0 and -1 are flags, then the minimum allowable value is 1. The logic should have been: range [1, something]; 0 (optionally); -1 (optionally). 
Values abs(x) between 0 and 1 (exclusive) should be disallowed and, like an attempt to specify 0.5 (without units), should throw an error. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Thu, Sep 25, 2014 at 1:04 AM, Tom Lane t...@sss.pgh.pa.us wrote: David Johnston david.g.johns...@gmail.com writes: On Thu, Sep 25, 2014 at 12:46 AM, Tom Lane t...@sss.pgh.pa.us wrote: TBH I've also been wondering whether any of these proposed cures are better than the disease. The changes that can be argued to make the behavior more sane are also ones that introduce backwards compatibility issues of one magnitude or another. And I do not have a lot of sympathy for let's not change anything except to throw an error in a case that seems ambiguous. That's mostly being pedantic, not helpful, especially seeing that the number of field complaints about it is indistinguishable from zero. Then what does it matter that we'd choose to error-out? The number of complaints about the *existing* behavior is indistinguishable from zero (AFAIR anyway). It does not follow that deciding to throw an error where we did not before will draw no complaints. Any change has the potential to draw complaints. For you it seems that "hey, I upgraded to 9.5 and my logs are being rotated out every minute now; I thought I had that turned off" is the desired complaint. Greg wants: "hey, my 1 hour log rotation is now happening every minute." If the error message is written correctly, most people upon seeing the error will simply fix their configuration and move on - regardless of whether they were proactive in doing so having read the release notes. And, so what if we get some complaints? We already disallow the specified values (instead, turning them into zeros) so it isn't like we are further restricting user capabilities. If I were to commit a patch that didn't throw an error I'd ask the author to provide an outline for each of the affected parameters and what it would mean (if possible) for a setting currently rounded to zero to instead be rounded to 1. 
It's the same language in the release notes to get someone to avoid the error as it is to get them to not be impacted by the rounding change, so the difference is those people who would not read the release notes. The error outcome is simple, direct, and fool-proof - and conveys the fact that what they are asking for is invalid. It may be a little scary but at least we can be sure nothing bad is happening in the system. If the argument is that there are too few possible instances out there to expend the effort to catalog every parameter, then there isn't enough surface area to care about throwing an error. And I still maintain that anyone setting values for a clean installation would rather be told their input is not valid instead of silently making it so. Changing the unit for log_rotation_age when there is basically no one asking for that seems like the worse solution at face value; my +1 notwithstanding. I gave that mostly because if you are going to overhaul the system then making everything ms seems like the right thing to do. I think that such an effort would be a waste of resources. I don't have clients so maybe my acceptance of errors is overly optimistic - but in this case I truly don't see enough people even realizing that there was a change, and those few that do should be able to read the error and make the same change that they would need to make anyway if the rounding option is chosen. My only concern here, and it probably applies to any solution, is that the change is implemented properly so that all possible user input areas throw the error correctly and as appropriate to the provided interface. That is, interactive use immediately errors out without changing the default values while explicit/manual file changes cause the system to error at startup. 
I haven't looked into the code parts of this but I have to imagine this is already covered since even with units and such invalid input is still possible and needs to be addressed; this only adds one more check to that existing routine. David J.
Re: [HACKERS] RLS feature has been committed
On Tue, Sep 23, 2014 at 9:09 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-22 21:38:17 -0700, David G Johnston wrote: Robert Haas wrote It's difficult to imagine a more flagrant violation of process than committing a patch without any warning and without even *commenting* on the fact that clear objections to commit were made on a public mailing list. If that is allowed to stand, what can we assume other than that Stephen, at least, has a blank check to change anything he wants, any time he wants, with no veto possible from anyone else? I'm of a mind to agree that this shouldn't have been committed...but I'm not seeing where Stephen has done sufficient wrong to justify crucifixion. I've not seen much in the way of 'crucifixion' before this email. And I explicitly *don't* think it's warranted. Also it's not happening. I maybe got a little carried away with my hyperbole... At this point my hindsight says a strictly declaratory statement of reasons this is not ready combined with reverting the patch would have been sufficient; or even just an "I am going to revert this for these reasons" post. The difference between building support for a revert and gathering a mob is a pretty thin line. The reason it's being discussed is to find a way to align the different views about when to commit stuff. The primary goal is *not* to revert the commit or anything but to make sure we're not slipping into procedures we all would regret. Which *really* can happen very easily. We're all humans and most of us have more than enough to do. So, the second option then...and "I'm sorry, but this should never have been committed" tends to cause one to think it should therefore be reverted. Though I guess if you indeed feel that his actions were truly heinous you should also then put forth the proposal that his ability to commit be revoked. I think *you* are escalating this to something unwarranted here by the way you're painting the discussion. 
Not everyone who reads -hackers knows all the people involved personally. I had an initial reaction to these e-mails that I thought I would share, nothing more. I'm not going to quote the different comments that led me to my feeling that the response to this was disproportionate to the offense but after a first pass - which is all many people would do - that is what I came away with. Though you could say I fell into the very same trap by reacting off my first impression... David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Tue, Sep 23, 2014 at 1:30 AM, Tom Lane t...@sss.pgh.pa.us wrote: David Johnston david.g.johns...@gmail.com writes: My original concern was things that are rounded to zero now will not be in 9.5 if the non-error solution is chosen. The risk profile is extremely small but it is not theoretically zero. This is exactly the position I was characterizing as an excessively narrow-minded attachment to backwards compatibility. We are trying to make the behavior better (as in less confusing), not guarantee that it's exactly the same. I am going to assume that the feature designers were focused on wanting to avoid: 1000 * 60 * 5 to get a 5-minute value set on a millisecond unit parameter. The designer of the variable, in choosing a unit, has specified the minimum value that they consider sane. Attempting to record an insane value should throw an error. I do not support throwing an error on all attempts to round but specifying a value less than 1 in the variable's unit should not be allowed. If such a value is proposed the user either made an error OR they misunderstand the variable they are using. In either case telling them of their error is more friendly than allowing them to discover the problem on their own. If we are only allowed to change the behavior by throwing errors in cases where we previously didn't, then we are voluntarily donning a straitjacket that will pretty much ensure Postgres doesn't improve any further. I'm not proposing project-level policy here. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Tue, Sep 23, 2014 at 3:05 PM, Greg Stark st...@mit.edu wrote: Fwiw I agree with TL2. The simplest, least surprising behaviour to explain to users is to say we round to nearest and if that means we rounded to zero (or another special value) we throw an error. We could list the minimum value in pg_settings and maybe in documentation. I'm not sure TL2 would agree that you are agreeing with him... To the other point the minimum unit-less value is 1. The unit that is applied is already listed in pg_settings and the documentation. While 0 is allowed it is more of a flag than a value. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Tue, Sep 23, 2014 at 10:11 PM, Gregory Smith gregsmithpg...@gmail.com wrote: On 9/23/14, 1:21 AM, David Johnston wrote: This patch should fix the round-to-zero issue. If someone wants to get rid of zero as a special case let them supply a separate patch for that improvement. I am going to wander into this fresh after just reading everything once (but with more than a little practice with real-world GUC mucking) and say that, no, it should not even do that. The original complaint here is real and can be straightforward to fix, but I don't like any of the proposals so far. My suggestion: fix the one really bad one in 9.5 with an adjustment to the units of log_rotation_age. That's a small commit that should thank Tomonari Katsumata for discovering the issue and even suggesting a fix (that we don't use) and a release note; sample draft below. Stop there. +1. I'm not convinced minute is wrong but it does imply a level of user-friendliness (or over-parenting) that we can do away with. We could just change the units for log_rotation_age to seconds, then let the person who asks for a 10s rotation time suffer for that decision only with many log files. The person who instead chooses 10ms may find log rotation disabled altogether because that rounds to zero, but now we are not talking about surprises on something that seems sane--that's a fully unreasonable user request. Given the following, why not just pick ms for log_rotation_age now? Following that style of fix all the way through to the sort of release notes needed would give something like this: "log_rotation_age is now set in units of seconds instead of minutes. Earlier installations of PostgreSQL that set this value directly, to a value in minutes, should change that setting to use a time unit before migrating to this version." Are there any special considerations for people using pg_dump vs. those using pg_upgrade? 
If I were feeling ambitious about breaking configurations with a long-term eye toward improvement, I'd be perfectly happy converting everything on this list to ms. We live in 64-bit integer land now; who cares if the numbers are bigger? Then the rounding issues only impact sub-millisecond values, making them all squarely in nonsense-setting land. Users will be pushed very clearly to always use time units in their postgresql.conf files instead of guessing whether something is set in ms vs. seconds. Seeing the reaction to a units change on log_rotation_age might be a useful test for how that sort of change plays out in the real world. If we are going to go that far, why not simply modify the GUC input routine to require unit-values on these particular parameters? Direct manipulation of pg_settings probably would need some attention but everything else could reject integers and non-unit-specifying text. Allow the same input routine to accept the constants on|off|default and convert them internally into whatever the given GUC requires and you get the UI benefits without mucking around with the implementation details. Modify pg_settings accordingly to hide those now-irrelevant pieces. For UPDATE a trigger can be used to enforce the text-only input requirement. As long as we do not make microseconds (µs) a valid time unit, it is currently impossible to directly specify a fraction of a millisecond. David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Tuesday, September 23, 2014, Tom Lane t...@sss.pgh.pa.us wrote: David G Johnston david.g.johns...@gmail.com writes: Can you either change your mind back to this opinion you held last month or commit something you find acceptable - it's not like anyone would revert something you commit... :) Hey, am I not allowed to change my mind :-) ? Seriously though, the main point I was making before stands: if the details of the rounding rule matter, then we messed up on choosing the units of the particular GUC. The question is what are we going to do about zero being a special case. Everyone agrees non-zero must not round to zero; as long as that happens I'm not seeing anyone willing to spend any more effort on the details. I'm not entirely sure Peter agrees; he wanted to get rid of zero being a special case, rather than worry about making the rounding rule safe for that case. But assuming that that's a minority position: it seems to me that adding a new error condition is more work for us, and more work for users too, and not an especially sane decision when viewed from a green-field perspective. My proposal last month was in response to some folk who were arguing for a very narrow-minded definition of backwards compatibility ... but I don't think that's really where we should go. regards, tom lane This patch should fix the round-to-zero issue. If someone wants to get rid of zero as a special case let them supply a separate patch for that improvement. My original concern was things that are rounded to zero now will not be in 9.5 if the non-error solution is chosen. The risk profile is extremely small but it is not theoretically zero. David J.
Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract
On Tue, Sep 9, 2014 at 12:03 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Sep 8, 2014 at 6:20 PM, David G Johnston david.g.johns...@gmail.com wrote: On Mon, Sep 8, 2014 at 11:45 AM, Robert Haas wrote: On Thu, Sep 4, 2014 at 6:38 PM, David Johnston wrote: Ignoring style and such, did anything I wrote help to clarify your understanding of why PQputCopyEnd does what it does? As I say this and start to try and read the C code I think that it is a good source for understanding novice assumptions, but there is a gap between the docs and the code - though I'm not sure you've identified the only (or even proper) location. Honestly, not really. I thought the language I previously discussed with Tom was adequate for that; and I'm a little confused as to why we've gotten on what seems to be somewhat of a digression into nearby but otherwise-unrelated documentation issues. My theory here is that if you discover a single point of confusion that cannot be attributed to a typo (or something basic like that) then why not go looking around and see if there are any other issues in nearby code/documentation. Furthermore, not being the writer of said code/documentation, a fresh perspective of the entire body of work - and not just the few lines you quoted - has value. For me, if nothing else I took it as a chance to learn and evaluate the status quo. And as noted below there are a couple of items to address in the nearby code even if the documentation itself is accurate - I just took a very indirect route to discover them. Much of the current documentation is copy-and-pasted directly from the source code - with a few comments tacked on for clarity. I'm sure some layout and style concerns were present during the copy-and-paste but the user-facing documentation does not, and probably should not ideally, directly emulate the source code comments. 
Admittedly, in the libpq section this can be considerably relaxed since the target user is likely a C programmer. Unlike PQputCopyData, there is no explicit logic (pqIsnonblocking(conn)) in PQputCopyEnd for non-blocking mode (there is an implicit one via pqFlush). This does seem like an oversight - if a minor one, since the likelihood of not being able to add the EOF to the connection's buffer seems quite small - but I would expect, on the basis of symmetry alone, for both of them to have buffer-filled testing logic. And, depending on how large *errormsg is, the risk of being unable to place the data in the buffer isn't as small as in the expected EOF case. Yeah, I think that's a bug in PQputCopyEnd(). It should have the same kind of pqCheckOutBufferSpace() check that PQputCopyData() does. Anybody disagree here? I'm getting way beyond my knowledge level here but my assumption from the documentation was that the async mode behavior of returning zero revolved around retrying in order to place the data into the buffer - not retrying to make sure the buffer is empty. The caller deals with that on their own based upon their blocking mode decision. Though we would want to call pqFlush for blocking mode callers here since this function should block until the buffer is clear. PQputCopyEnd() already does flush; and PQputCopyData() shouldn't flush. My point being pqFlush is sensitive to whether the caller is in blocking mode. Also, PQputCopyData DOES FLUSH if the buffer is full...then attempts to resize the buffer...then returns 0 (or -1 in blocking mode). How absolute is Tom's "Well, we certainly do NOT want a flush in PQputCopyData"? So, I thought I agreed with your suggestion that if the final pqFlush returns a 1 then PQputCopyEnd should return 0. Well, Tom set me straight on that; so I don't think we're considering any such change. I think you need to make sure you understand the previous discussion in detail before proposing how to adjust the documentation (or the code). 
That was largely what this exercise was about for me... having gone through all this and now re-reading Tom's wording I do understand and agree. As for the surrounding documentation, I think we are good since PQputCopyData already tests the buffer and correctly returns 0 in non-blocking mode. The issue with PQputCopyEnd seems to be a copy/paste error and an incorrect assumption regarding non-blocking mode. This doesn't address the usage bug we've been propagating where someone very well might use non-blocking mode and, upon seeing a return value of 1 from PQputCopyEnd, immediately call PQgetResult even if the buffer is not completely flushed. Our proposed documentation directs people to always poll pqFlush after calling PQputCopyEnd while the current documentation is not so strict. Is there anything to say/do about that fact or - more importantly - is there any real risk here? David J.
Re: [HACKERS] Re: [PATCH] parser: optionally warn about missing AS for column and table aliases
On Fri, Sep 5, 2014 at 6:27 PM, Marko Tiikkaja ma...@joh.to wrote: On 2014-09-05 11:19 PM, David G Johnston wrote: Marko Tiikkaja-4 wrote I probably couldn't mount a convincing defense of my opinion but at first blush I'd say we should pass on this. Not with prejudice - looking at the issue periodically has merit - but right now it seems to be mostly adding clutter to the code without significant benefit. Are you talking about this particular option, or the notion of parser warnings in general? Because if we're going to add a handful of warnings, I don't see why this couldn't be on that list. This option as implemented in this specific manner. Tangential - I'd rather see something like EXPLAIN (STATIC) that would allow a user to quickly invoke a built-in static SQL analyzer on their query and have it point this kind of thing out. Adding a catalog for STATIC configurations in much the same way we have text search configurations would allow extensions and users to define their own rulesets that could be attached to their ROLE via a GUC - default_static_analyzer_ruleset - and also passed in as a modifier in the EXPLAIN invocation. If you knew there's a good chance you might make a mistake, you could probably avoid doing it in the first place. I make mistakes all the time; would I then always execute two commands when I want to do something? Having to run a special check my query please command seems silly. My follow-on post posited a solution that still uses the EXPLAIN mechanism but configured in a way so users can have it be always on if desired. Anyway, the idea of using a GUC and evaluating every query that is written (the added if statements) is not that appealing even if the impact of one more check is likely to be insignificant (is it?) I'm not sure how you would do this differently in the EXPLAIN (STATIC) case. Those ifs have to be somewhere, and that's always a non-zero cost. That being said, yes, performance costs of course have to be evaluated carefully.
If you add lots more checks, that is lots more ifs compared to a single if to see if static analysis should be attempted in the first place. For a single option it's largely a wash. Beyond that I don't know enough to say whether having the parser dispatch the query differently based upon it being preceded with EXPLAIN (STATIC) or a standard query would be any faster than a single IF for a single check. The more options you can think of for a novice mode the more appealing a formal static analysis interface becomes - both for execution and also because the number of options being introduced and managed can be made into formal configurations instead of an independent but related set of GUC variables. David J.
Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract
On Thu, Sep 4, 2014 at 5:13 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Sep 4, 2014 at 1:18 PM, David G Johnston david.g.johns...@gmail.com wrote: Specific observations would help though that is partly the idea - I've been more focused on clarity and organization even if it requires deviating from the current general documentation style. OK. - to the network connection used by <application>libpq</application>. + to the connection used by <application>libpq</application>. This change is unrelated to the topic at hand and seems not to be an improvement. Fair point - though I did question the necessity of network in the accompanying e-mail. The implied suggestion is that if I do find any other areas that look like they need fixing - even in the same file - I should separate them out into a separate patch. Though I have seen various while I was in there I also fixed such-and-such commits previously so the line is at least somewhat fluid. + <note> + <para> +Please review the function notes for specific interface protocols - +the following is a simplified overview. + </para> + </note> This seems pointless. Of course general documentation will be less specific than documentation for specific functions. The existing wording was being verbose in order to be correct. In a summary like this I'd trade the precision that is included elsewhere for being reasonably accurate and general. + First, the application issues the SQL <command>COPY</command> command via <function>PQexec</function> or one + of the equivalent functions. The response + will either be an error or a <structname>PGresult</> object bearing + a status code for <literal>PGRES_COPY_OUT</literal> or + <literal>PGRES_COPY_IN</literal> call implied by the specified copy + direction (TO or FROM respectively). This implies that the response won't be a PGresult in the error case, but of course the function has the same C result type either way.
One of the trade-offs I mentioned... it's more style than anything, but removing the parenthetical (if there is no error in the command) and writing it more directly seemed preferable in an overview such as this. Better: The function will either throw an error or return a PGresult object[...] + <para> + Second, the application should use the functions in this + section to receive data rows or transmit buffer loads. Buffer loads are + not guaranteed to be processed until the copy transfer is completed. + </para> The main change here vs. the existing text is that you're now using the phrase buffer loads to refer to what gets transmitted, and data rows to refer to what gets received. The existing text uses the term data rows for both, which seems correct to me. My first reaction on reading your revised text was wait, what's a buffer load?. So, my generalization policy working in reverse - since the transmit side does not have to send complete rows, implying that it does here is (albeit acceptably) inaccurate. + Third, as lastly, when the data transfer is + complete the client must issue a PQgetResult to commit the copy transfer + and get the final <structname>PGresult</> object that indicates I assume you mean "and lastly", since "as lastly" doesn't sound like good English. Yep. - At this point further SQL commands can be issued via - <function>PQexec</function>. (It is not possible to execute other SQL - commands using the same connection while the <command>COPY</command> - operation is in progress.) Removing this text doesn't seem like a good idea. It's a quite helpful clarification. The note you've added in its place doesn't seem like a good substitute for it, and more generally, I think we should avoid the overuse of constructs like note. Emphasis needs to be used minimally or it loses its meaning. Was trying to remove repetition here - happy to consider alternative ways of doing so if the note is objectionable.
If this is not acceptable I'm happy to incorporate the ideas of others to try and get the best of both worlds. + <para> +The return value of both these function can be one of [-1, 0, 1] +whose meaning depends on whether you are in blocking or non-blocking mode. + </para> The use of braces to list a set of possible values is not standard in mathematics generally, our documentation, or any other documentation I have seen. Agreed + <para> +Non-Blocking Mode: A value of 1 means that the payload was +placed in the queue while a -1 means an immediate and permanent failure. +A return value of 0 means that the queue was full: you need to try again +at the next wait-ready. + </para> We generally avoid emphasizing captions or headings in this way. The markup engine has no knowledge that Non-Blocking Mode is special; it will format and style this just as if you had written This is how it works: a value of 1 means That's probably not
Re: [HACKERS] PL/pgSQL 2
On Tue, Sep 2, 2014 at 4:48 PM, Joshua D. Drake j...@commandprompt.com wrote: On 09/02/2014 09:48 AM, Bruce Momjian wrote: As a case in point, EDB have spent quite a few man-years on their Oracle compatibility layer; and it's still not a terribly exact match, according to my colleagues who have looked at it. So that is a tarbaby I don't personally care to touch ... even ignoring the fact that cutting off EDB's air supply wouldn't be a good thing for the community to do. What any commercial entity and the Community do are mutually exclusive and we can not and should not determine what features we will support based on any commercial endeavor. From where I sit the mutually exclusive argument doesn't seem to be true - and in fact is something I think would be bad if it were. We shouldn't be afraid to add features to core that vendors are offering but at the same time the fact that the Oracle compatibility aspects are commercial instead of in-core is a plus to help ensure that there are people making a decent living off PostgreSQL and thus are invested in its future - and directly enticed to improve our product in order to get them more converts. I don't believe the community wants to compete on that basis nor does necessarily standardizing the layer and letting the vendors compete on consulting and implementation services seem a strong investment for the community to make. There is no way to consider development plans without considering what the entire eco-system is doing: commercial and community both. A blanket statement like above is a good way to make sure you don't get too carried away with letting commercial vendors provide things that should be in core; but at the same time the hurdle becomes higher if those features can be had commercially. My $0.02 David J.
Re: [HACKERS] PL/pgSQL 2
On Mon, Sep 1, 2014 at 11:12 PM, Craig Ringer cr...@2ndquadrant.com wrote: On 09/02/2014 09:40 AM, David G Johnston wrote: Random thought as I wrote that: how about considering how pl/pgsql functionality can be generalize so that it is a database API that another language can call? In that way the server would drive the core functionality and the language would simply be an interpreter that enforces its specific notion of acceptable syntax. That's pretty much what we already have with the SPI and procedural language handler infrastructure. PL/Perl, PL/Python, etc exist because we have this. What do you see as missing from the current infrastructure? What can't be done that should be able to be done in those languages? Yet pl/pgsql does not have to use SPI-interface type calls to interact with PostgreSQL at the SQL level... I don't have an answer to your questions but the one I'm asking is whether a particular language could hide all of the SPI stuff behind some custom syntax so that it in effect looks similar to what pl/pgsql does today? Or, more to the point, does pl/pgsql use the same SPI interface behind the scenes as PL/Perl or does it have its own special interface? David J.
Re: [HACKERS] Built-in binning functions
On Sun, Aug 31, 2014 at 7:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: David G Johnston david.g.johns...@gmail.com writes: Since bucket is the 'verb' here (in this specific case meaning lookup the supplied value in the supplied bucket definition) and width is a modifier (the bucket specification describes an equal-width structure) I suggest literal_bucket(val, array[]) such that the bucket is still the verb but now the modifier describes a structure that is literally provided. It's a very considerable stretch to see bucket as a verb here :-). Maybe that's why the SQL committee's choice of function name seems so unnatural (to me anyway). I was wondering about bucket_index(), ie get the index of the bucket this value falls into. Or get_bucket(), or get_bucket_index() if you like verbosity. regards, tom lane I got stuck on the thought that a function name should ideally be/include a verb... Even if you read it as a noun (and thus the verb is an implied get) the naming logic still holds. I pondered a get_ version though the argument for avoiding conflicts with user code decreases its appeal. The good part about SQL-standard naming is that the typical programmer isn't likely to pick a conflicting name. bucket_index is appealing by itself though likely to collide with user code... as bad as I think width_bucket is as a name, the fact is that it currently exists and, even if forced, consistency has merit. David J.
Re: [HACKERS] Fixed redundant i18n strings in json
On Thu, Aug 7, 2014 at 5:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: David G Johnston david.g.johns...@gmail.com writes: Tom Lane-2 wrote Surely that was meant to read invalid number OF arguments. The errhint is only charitably described as English, as well. I'd suggest something like Arguments of json_build_object() must be pairs of keys and values. --- but maybe someone else can phrase that better. The user documentation is worth emulating here: http://www.postgresql.org/docs/9.4/interactive/functions-json.html errmsg("argument count must be divisible by 2") errhint("The argument list consists of alternating names and values") Seems reasonable to me. Note that I s/keys/names/ to match said documentation. Hm. The docs aren't too consistent either: there are several other nearby places that say keys. Notably, the functions json[b]_object_keys() have that usage embedded in their names, where we can't readily change it. I'm inclined to think we should s/names/keys/ in the docs instead. Thoughts? Agreed - have the docs match the common API term usage in our implementation. Not sure it's worth a thorough hunt but at least fix them as they are noticed. David J.
Re: [HACKERS] Re: Proposed changing the definition of decade for date_trunc and extract
On Fri, Aug 1, 2014 at 8:15 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: On 02/08/14 12:32, David G Johnston wrote: Any supporting arguments for 1-10 = 1st decade other than technical perfection? I guess if you use data around and before 1AD you care about this more, and rightly so, but given sound arguments for both methods the one more useful to more users - who I suspect dominantly care about years after 1900 - should win. So -1 to change for breaking backward compatibility and -1 because the current behavior seems to be more useful in everyday usage. Since there was no year zero: then it follows that the first decade comprises years 1 to 10, and the current Millennium started in 2001 - or am I being too logical??? :-) This is SQL, only relational logic matters. All other logic can be superseded by committee consensus. IOW - and while I have no way of checking - this seems like something that may be governed by the SQL standard... in which case adherence to that would trump mathematical logic. David J.
Re: [HACKERS] Is there a way to temporarily disable a index
On Fri, Jul 11, 2014 at 12:12 PM, Michael Banck mba...@gmx.net wrote: On Fri, Jul 11, 2014 at 11:07:21AM -0400, Tom Lane wrote: David G Johnston david.g.johns...@gmail.com writes: Benedikt Grundmann wrote That is, is it possible to tell the planner that an index is off limits, i.e. don't ever generate a plan using it? Catalog hacking could work but is not recommended (nor do I know the proper commands and limitations). Do you need the database/table to accept writes during the testing period? Hacking pg_index.indisvalid could work, given a reasonably recent PG. I would not try it in production until I'd tested it ;-) I wonder whether this should be exposed at the SQL level? Hacking pg_index is left to superusers, but the creator of an index (or the owner of the schema) might want to experiment with disabling indices while debugging query plans as well. Turns out this is already in the TODO; Steve Singer has requested this (in particular, ALTER TABLE ... ENABLE|DISABLE INDEX ...) in http://www.postgresql.org/message-id/87hbegz5ir@cbbrowne.afilias-int.info (as linked to from the TODO wiki page), but the neighboring discussion was mostly about FK constraints. Thoughts? Michael Apparently work is ongoing to allow EXPLAIN to calculate the impact a particular index has on table writes. What is needed is a mechanism to temporarily remove the impact of specific indexes on reads without having to disable the index for writing. Ideally this would work on a per-query basis, so altering the catalog doesn't make sense. I know we do not want traditional planner hints but in the spirit of the existing enable_indexscan GUC there should be a disable_readofindex='table1.index1,table1.index2,table2.index1' GUC capability that would allow for session, user, or system-level control of which indexes are to be used during table reads. David J.
Re: [HACKERS] PL/pgSQL support to define multi variables once
On Fri, Jun 13, 2014 at 11:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: David G Johnston david.g.johns...@gmail.com writes: Tom Lane-2 wrote At the very least I think we should stay away from this syntax until the SQL committee understand it better than they evidently do today. I don't want to implement it and then get caught by a future clarification that resolves the issue differently than we did. It's not quite as unclear as you make it out to be: Yes it is. Notwithstanding the decision making of the SQL committee I was rejecting as inconsistent: SET random_1 = 0; SET random_2 = 0; SET random_3 = random(1234); The ambiguity regarding re-execute or copy still remains. That's not the reading I want, and it's not the reading you want either, but there is nothing in the existing text that justifies single evaluation. So I think we'd be well advised to sit on our hands until the committee clarifies that. It's not like there is some urgent reason to have this feature. Agreed. I don't suppose there is any support or prohibition on the one,two,three integer := generate_series(1,3); interpretation... not that I can actually come up with a good use case that wouldn't be better implemented via a loop in the main body. David J.
Re: [HACKERS] RETURNING PRIMARY KEY syntax extension
On Monday, June 9, 2014, Ian Barwick i...@2ndquadrant.com wrote: On 09/06/14 14:47, David G Johnston wrote: Ian Barwick wrote Hi, The JDBC API provides the getGeneratedKeys() method as a way of retrieving primary key values without the need to explicitly specify the primary key column(s). This is a widely-used feature, however the implementation has significant performance drawbacks. Currently this feature is implemented in the JDBC driver by appending RETURNING * to the supplied statement. However this means all columns of affected rows will be returned to the client, which causes significant performance problems, particularly on wide tables. To mitigate this, it would be desirable to enable the JDBC driver to request only the primary key value(s). ISTM that having a non-null returning clause variable when no RETURNING is present in the command makes things more complicated and introduces unnecessary checks in the not uncommon case of multiple non-returning commands being issued in series. returningList was able to be null, and so should returningClause be. Then if non-null, first check for the easy column listing and then check for the more expensive PK lookup request. Then again the extra returning checks may just amount to noise. David J.
Re: [HACKERS] 9.4 release notes
On Mon, May 19, 2014 at 10:23 AM, Bruce Momjian br...@momjian.us wrote: On Thu, May 15, 2014 at 06:08:47PM -0700, David G Johnston wrote: Some errors and suggestions - my apologies for the format as I do not have a proper patching routine setup. Sorry, let me address some items I skipped on your list: IIUC: Logical decoding allows for streaming of statement-scoped database changes. I think logical decoding does more than statement-scoped database changes, e.g. it can enable multi-master without triggers. I am hesitant to mention specific items in the release notes for that reason. Yeah, probably need to look at this as a bigger unit of work and better understand it first. But to be optionally streamed in a configurable format seems too vague. IIUC: Remove line length restrictions from pgbench. Uh, there is a specific place we removed the line length restriction in pgbench. I simply interpreted: Allow pgbench to process script files of any line length (Sawada Masahiko) The previous line limit was BUFSIZ. script files of any line length just doesn't sound right to me; and if my re-wording is wrong then it also does not actually communicate what was changed very well. style: add comma - Previously[,] empty arrays were returned (occurs frequently but is minor) Thanks for the comma adjustments --- I can't decide on that sometimes. Not a grammar expert by any means but there are generally two uses of previously: He previously removed that from the build. Previously, he removed that from the build. In the first case previously simply modifies removed, while in the second case previously modifies the entire fragment and thus acts as a transition word. In the second case you want the comma; in the first you do not. In most cases a leading previously will be of the second form and so wants the comma. As an aside, the original wording of the above example would imply that a non-empty array was returned, since a previously empty array is one that now has content.
FYI, the most frequently updated doc build is here: http://momjian.us/pgsql_docs/release-9-4.html -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + David J.
Re: [HACKERS] 9.4 release notes
On Mon, May 19, 2014 at 12:36 AM, Bruce Momjian br...@momjian.us wrote: On Thu, May 15, 2014 at 06:08:47PM -0700, David G Johnston wrote: Some errors and suggestions - my apologies for the format as I do not have a proper patching routine setup. Patch Review - Top to Bottom (mostly, I think...) I have made your suggested adjustments in the attached applied patch. Add ROWS FROM() syntax to allow horizontal concatenation of set-returning functions in the FROM-clause (Andrew Gierth) - Maybe a note about using this to avoid least-common-multiple expansion? Sorry, I do not understand this. Apparently, neither do I. I think I was confusing this with set-returning functions in the select-list. In the from-clause, comma-separated functions are combined using a cross-join, not LCM-expansion. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + David J.
Re: [HACKERS] DISCARD ALL (Again)
On Thursday, April 17, 2014, Joshua D. Drake j...@commandprompt.com wrote: On 04/17/2014 07:07 PM, David G Johnston wrote: On 04/17/2014 05:24 PM, Tom Lane wrote: On the whole I'm not sure this is something we ought to get into. If you really need a fresh session, maybe you should start a fresh session. Isn't the whole point to avoid the reconnection overhead, especially for connection poolers? DISCARD ALL shouldn't cause any cleanup that wouldn't otherwise occur when a session disconnects. True global data (not just session global) should be excluded. The GD is global to the session only (like temp tables). Yes. Tom's response makes it sound like the proposal is to throw away the entire language environment for the whole server (thus needing superuser privilege) so I'm pointing out that what we are discussing is not that invasive. A better wording of the promise would be: DISCARD ALL leaves the session in the same state it would be in if the underlying connection were dropped and re-established. Except that it doesn't. But is this what you intend it to mean, by implementing these features, or are you thinking something different? David J.
Re: [HACKERS] polymorphic SQL functions has a problem with domains
Tom Lane-2 wrote Pavel Stehule <pavel.stehule@> writes: I was informed about the impossibility of using polymorphic functions together with domain types; see: create domain xx as numeric(15); create or replace function g(anyelement, anyelement) returns anyelement as $$ select $1 + $2 $$ language sql immutable; postgres=# select g(1::xx, 2::xx); ERROR: return type mismatch in function declared to return xx DETAIL: Actual return type is numeric. CONTEXT: SQL function g during inlining That example doesn't say you can't use polymorphic functions with domains. It says that this particular polymorphic function definition is wrong: it is not making sure its result is of the expected data type. I don't recall right now whether SQL functions will apply an implicit cast on the result for you, but even if they do, an upcast from numeric to some domain over numeric wouldn't be implicit. How would that be possible though? Since any number of domains could be defined over numeric, as soon as the + operator causes the domain to be lost there is no way to get it back manually - you cannot just make it SELECT ($1 + $2)::xx. Does something like SELECT ($1 + $2)::$1%TYPE exist, where you can explicitly cast to the type of the input argument? David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/polymorphic-SQL-functions-has-a-problem-with-domains-tp5798349p5798356.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] polymorphic SQL functions has a problem with domains
Tom Lane-2 wrote David Johnston <polobo@> writes: Does something like SELECT ($1 + $2)::$1%TYPE exist, where you can explicitly cast to the type of the input argument? I don't think SQL-language functions have such a notation, but it's possible in plpgsql, if memory serves. Indeed. http://www.postgresql.org/docs/9.3/interactive/plpgsql-declarations.html#PLPGSQL-DECLARATION-TYPE Section 40.3.3 You lose inlining but at least it (should) work. David J.
Re: [HACKERS] PQputCopyData dont signal error
steve k wrote Am I to understand then that I should expect no error feedback if copy fails because of something like attempting to insert alphabetic into a numeric? I apologize for my ignorance, but all my return codes were always successful (PGRES_COMMAND_OK) even if nothing was copied due to garbage data. Also, calling PQgetResult never returned any information either because everything was always PGRES_COMMAND_OK. If that's what is supposed to happen then I have completely missed the boat and apologize for wasting everyone's time. In your example you successfully sent an error message to the server and so PQputCopyEnd does not report an error since it did what it was asked to do. Later, when you finish the copy and ask for the error message, you probably will get the same message that you sent here but you may get a different one depending on whether the server encountered any other errors before you sent the explicit error. Regardless of the message you are guaranteed to get back an error after calling PQgetResult. It seems to me that you need to supply a simple C program - along with a test file that you expect to fail - that never reports an error so that others may evaluate actual code. The likelihood of this NOT being human error is small so we really have to see your actual code since that is most probably the culprit. Note that, as Tom mentioned, psql is open source. If you are trying to duplicate its behavior in your own code you should probably look to see what it does. The fact that you get the proper errors in some cases means that the server is quite obviously capable of working in the manner you desire and thus the client - again, your specific software - is where any issue likely resides. Again, from what I'm reading PQputCopy(Data|End) will not report on data parsing issues. You will only see those after issuing PQgetResult - which is noted in the last paragraph of the PQputCopyEnd documentation excerpt you provided.
The put commands only report whether the sending of the data was successful. David J.
Re: [HACKERS] PQputCopyData dont signal error
steve k wrote I started with this: DBInsert_excerpts6_test_cpdlc.cpp http://postgresql.1045698.n5.nabble.com/file/n5798049/DBInsert_excerpts6_test_cpdlc.cpp Can you point out to me where in that code you've followed this instruction from the documentation: After successfully calling PQputCopyEnd, call PQgetResult to obtain the final result status of the COPY command. One can wait for this result to be available in the usual way. Then return to normal operation. Since I cannot seem to find it. David J.
Re: [HACKERS] PQputCopyData dont signal error
steve k wrote Sorry I can't provide more information but I do appreciate your time. If you can't get any further with it I understand and don't expect another reply. For the benefit of others I'm reading this as basically you've found a better way to do this so you are no longer concerned with correcting the broken (incomplete) code you have provided. It should be as simple as adding one more if statement between the copy-end check and the overall failure check to see whether the copy command itself failed, in addition to the existing checks to see if sending the data or ending the data send failed. I will not pretend to write C code but the general logic and requirements seem quite clear from the documentation I have read/shown. David J.
Re: [HACKERS] psql \d+ and oid display
Bruce Momjian wrote On Fri, Mar 28, 2014 at 03:53:32PM -0300, Fabrízio de Royes Mello wrote: On Fri, Mar 28, 2014 at 3:41 PM, Tom Lane <tgl@.pa> wrote: Bruce Momjian <bruce@> writes: On Thu, Mar 27, 2014 at 02:54:26PM -0400, Stephen Frost wrote: I believe Bruce was suggesting to show it when it is set to *not* the default, which strikes me as perfectly reasonable. We seem to be split on the idea of having Has OIDs display only when the oid status of the table does not match the default_with_oids default. FWIW, I think that having the display depend on what that GUC is set to is a seriously *bad* idea. It will mean that you don't actually know, when looking at the output of \d, whether the table has OIDs or not. I could get behind a proposal to suppress the line when there are not OIDs, full stop; that is, we print either Has OIDs: yes or nothing. But I think this patch just makes things even more surprising when default_with_oids is turned on. Something like the attached ? I assume it would be more like my attachment, i.e. since we are only displaying it when OIDs exist, there is no value for an oid status field --- just say Has OIDs or Includes OIDs, or something like that. I know some people are saying there is no need to change the current output --- I am only saying that the importance of showing the lack of OIDs has lessened over the years, and we should reconsider its importance. If we reconsider and still think we are fine, that's good with me. I am saying we should not just keep doing this because we have always displayed it in the past. As my belief is that 99% of the uses of \d are for human consumption (because machines should in most cases hit the catalogs directly), strictly displaying Includes OIDs when appropriate has my +1. Uses of \d+ in regression suites will be obvious and quickly fixed and likely account for another 0.9%.
psql backslash commands are not machine API contracts and should be adapted for optimal human consumption; that neuters the argument for maintaining backward compatibility. David J.
Re: [HACKERS] PQputCopyData dont signal error
steve k wrote I realize this is an old thread, but it seems to be the only discussion I can find on this topic: I have a problem with the PQputCopyData function. It doesn't signal some errors. I am using, from within a C++ program: PQexec(m_pConn, COPY... ...FROM stdin), followed by PQputCopyData(m_pConn, ssQuery.str().c_str(), ssQuery.str().length()), finished with PQputCopyEnd(m_pConn, NULL).

This may be oversimplified but... To summarize: the PQexec needs a matching PQgetResult. The PQputCopyEnd only closes the preceding PQputCopyData. The server is not compelled to process the copy data until the client says that no more data is coming. Once PQputCopyEnd has returned, for practical purposes you are back to the generic command-handling API and need to proceed as you would for any other command - including being prepared to wait for long-running commands and to handle errors.

The request in this thread is for some means for the client to send partial data and, if the server has chosen to process that data (or maybe the client can compel it to start...) AND has encountered an error, then the client can simply abort the rest of the copy and return an error. The current API return values deal with the results of the actual call just performed and not with any pre-existing state on the server. Tom's suggestion is to add a polling method to query the current server state for those who care to expend the overhead, instead of imposing that overhead on all callers.

The C API seems strange to me, a non-C programmer, but at a cursory glance it is at least internally consistent and does provide lots of flexibility (especially for sync/async options) at the cost of complexity and having to make sure that you code in the appropriate PQgetResult checks or risk ignoring errors and reporting success incorrectly. David J.
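The sequence described above can be sketched in libpq C. This is a hedged illustration, not the poster's actual code: the table name (mytable), the function name, and the error handling style are invented for the example; the libpq calls themselves (PQexec, PQputCopyData, PQputCopyEnd, PQgetResult) are the real API under discussion.

```c
/* Sketch: assumes an already-open PGconn *conn. The key point from the
 * thread: PQputCopyEnd() returning 1 only means the end-of-data message
 * was queued -- the server's verdict on the COPY itself arrives via
 * PQgetResult(), which must be fetched and checked. */
#include <stdio.h>
#include <libpq-fe.h>

static int copy_rows(PGconn *conn, const char *data, size_t len)
{
    PGresult *res = PQexec(conn, "COPY mytable FROM STDIN");
    if (PQresultStatus(res) != PGRES_COPY_IN)
    {
        fprintf(stderr, "COPY failed to start: %s", PQerrorMessage(conn));
        PQclear(res);
        return -1;
    }
    PQclear(res);

    /* Existing checks: did sending the data, or ending the send, fail? */
    if (PQputCopyData(conn, data, (int) len) != 1 ||
        PQputCopyEnd(conn, NULL) != 1)
    {
        fprintf(stderr, "sending COPY data failed: %s", PQerrorMessage(conn));
        return -1;
    }

    /* The missing "one more if": did the COPY command itself fail? */
    int ok = -1;
    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_COMMAND_OK)
            ok = 0;
        else
            fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        PQclear(res);
    }
    return ok;
}
```

This cannot run without a live server and libpq headers, so it is offered only as a reading of the logic the thread asks for.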
Re: [HACKERS] psql \d+ and oid display
Bruce Momjian wrote When we made OIDs optional, we added an oid status display to \d+:

test=# \d+ test
                     Table "public.test"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 x      | integer |           | plain   |              |
Has OIDs: no

Do we want to continue displaying that OID line, or make it optional for cases where the value doesn't match default_with_oids?

If we didn't make it behave this way at the time of the change, then what has changed that we should make it behave this way now? I like the logic generally but not necessarily the change. The disadvantage of this change is that users (both end users and tools) of the data now also have to look at what the default is (or was at the time the text was generated) to know what a suppressed OIDs line means. Given how much information the typical \d+ generates, I would suspect that the added noise this introduces is quickly ignored by frequent users and not all that disruptive to those who use \d+ infrequently. Tools likely would prefer it to be always displayed. My $0.02 David J.
Re: [HACKERS] History of WAL_LEVEL (archive vs hot_standby)
shamccoy wrote Hello. I've been doing some benchmarks on performance / size differences between actions when wal_level is set to either archive or hot_standby. I'm not seeing a ton of difference. I've read some posts about discussions as to whether this parameter should be simplified to remove or merge these 2 values. I'd like to understand the historic reason behind having the extra hot_standby value. Was it to introduce replication and not disturb the already working archive value? If I'm new to Postgres, is there any strategic reason to use archive at this point if replication is something I'll be using in the future? I'm not seeing any downside to hot_standby unless I'm missing something fundamental. Thanks, Shawn

There is a semantic difference in that archive is limited to WAL shipping to a dead-drop area for future PITR. hot_standby implies that the WAL is being sent to another running system that is immediately reading in the information to maintain an exact live copy of the master. As I think both can be used for PITR, I don't believe there is much downside, technically or with resources, to using hot_standby instead of archive; but I do not imagine it having any practical benefit either. David J.
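A minimal postgresql.conf sketch of the choice being discussed; the values are illustrative only (the archive path is invented, and a standby would additionally need hot_standby = on in its own configuration):

```
# postgresql.conf -- master side, illustrative values only
wal_level = hot_standby      # 'archive' also supports PITR, but a standby
                             # replaying that WAL cannot take read-only queries
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'
```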
Re: [HACKERS] History of WAL_LEVEL (archive vs hot_standby)
Noah Misch-2 wrote On Thu, Mar 27, 2014 at 03:06:02PM -0700, David Johnston wrote: shamccoy wrote Hello. I've been doing some benchmarks on performance / size differences between actions when wal_level is set to either archive or hot_standby. I'm not seeing a ton of difference. I've read some posts about discussions as to whether this parameter should be simplified and remove or merge these 2 values. I'd like to understand the historic reason between have the extra hot_standby value. Was it to introduce replication and not disturb the already working archive value? I think so. If I'm new to Postgres, is there any strategic reason to use archive at this point if replication is something I'll be using in the future? I'm not seeing any downside to hot_standby unless I'm missing something fundamental. Probably not. You might manage to construct a benchmark in which the extra master-side work is measurable, but it wouldn't be easy. There is a semantic difference in that archive is limited to wal shipping to a dead-drop area for future PITR. hot_standby implies that the wal is being sent to another running system that is immediately reading in the information to maintain an exact live copy of the master. As I think both can be used for PITR I don't believe there is much downside, technically or with resources, to using hot_standby instead of archive; but I do not imagine it having any practical benefit either. wal_level=archive vs. hot_standby is orthogonal to the mechanism for transmitting WAL and time of applying WAL. Rather, it dictates whether a PostgreSQL cluster replaying that WAL can accept read-only connections during recovery. You can send wal_level=archive log data over streaming replication, even synchronous streaming replication. However, any standby will be unable to accept connections until failover ends recovery. 
On the flip side, if you use wal_level=hot_standby, a backup undergoing PITR can accept read-only connections while applying years-old WAL from an archive. That is useful for verifying, before you end recovery, that you have replayed far enough. Went looking for this in the docs... http://www.postgresql.org/docs/9.3/interactive/runtime-config-wal.html#GUC-WAL-LEVEL So I guess, no-restore/offline/online would be good names (and maybe wal_restore_mode instead of wal_level) if we started from scratch. Note that no-restore does not preclude same-system recovery. Just something to help me remember... David J.
Re: [HACKERS] History of WAL_LEVEL (archive vs hot_standby)
David Johnston wrote Noah Misch-2 wrote On Thu, Mar 27, 2014 at 03:06:02PM -0700, David Johnston wrote: shamccoy wrote Hello. I've been doing some benchmarks on performance / size differences between actions when wal_level is set to either archive or hot_standby. I'm not seeing a ton of difference. I've read some posts about discussions as to whether this parameter should be simplified and remove or merge these 2 values. I'd like to understand the historic reason between have the extra hot_standby value. Was it to introduce replication and not disturb the already working archive value? I think so. If I'm new to Postgres, is there any strategic reason to use archive at this point if replication is something I'll be using in the future? I'm not seeing any downside to hot_standby unless I'm missing something fundamental. Probably not. You might manage to construct a benchmark in which the extra master-side work is measurable, but it wouldn't be easy. There is a semantic difference in that archive is limited to wal shipping to a dead-drop area for future PITR. hot_standby implies that the wal is being sent to another running system that is immediately reading in the information to maintain an exact live copy of the master. As I think both can be used for PITR I don't believe there is much downside, technically or with resources, to using hot_standby instead of archive; but I do not imagine it having any practical benefit either. wal_level=archive vs. hot_standby is orthogonal to the mechanism for transmitting WAL and time of applying WAL. Rather, it dictates whether a PostgreSQL cluster replaying that WAL can accept read-only connections during recovery. You can send wal_level=archive log data over streaming replication, even synchronous streaming replication. However, any standby will be unable to accept connections until failover ends recovery. 
On the flip side, if you use wal_level=hot_standby, a backup undergoing PITR can accept read-only connections while applying years-old WAL from an archive. That is useful for verifying, before you end recovery, that you have replayed far enough. Went looking for this in the docs... http://www.postgresql.org/docs/9.3/interactive/runtime-config-wal.html#GUC-WAL-LEVEL So I guess, no-restore/offline/online would be good names (and maybe wal_restore_mode instead of wal_level) if we started from scratch. Note that no-restore does not preclude same-system recovery. Just something to help me remember... David J. Slightly tangential, but are the locking operations associated with the recent bugfix generated in both (all?) modes or only hot_standby? I thought it strange that transient locking operations were output with WAL, though I get it if they are there to support read-only queries. David J.
Re: [HACKERS] Minimum supported version of Python?
Peter Eisentraut-2 wrote On 3/18/14, 11:22 AM, Andrew Dunstan wrote: Actually, if you run a buildfarm animal you have considerable control over what it tests. I appreciate that. My problem here isn't time or ideas or coding, but lack of hardware resources. If I had hardware, I could set up tests for every build dependency under the sun.

I don't imagine there is enough churn in this area that having a constantly testing buildfarm animal is a strong need; but a wiki page dedicated to Python Support in PostgreSQL, where we can publicly and officially release testing results and commentary, would be an improvement. It sounds like the only caveat to supporting 2.3 is that we don't technically (or do we - is Decimal mandatory?) support an unmodified core installation, but instead require that at least one add-on module be installed. Assuming that clears up all the errors Tom is seeing, then saying that plpythonu works with a slightly modified 2.3 on all current releases of PostgreSQL isn't a stretch, nor does it commit us to fixing bugs in the unlikely event that any are discovered in two extremely stable environments.
Re: [HACKERS] Planner hints in Postgresql
Atri Sharma wrote On Mon, Mar 17, 2014 at 9:22 PM, Rajmohan C lt; csrajmohan@ gt; wrote: I am implementing planner hints in PostgreSQL to force the optimizer to select a particular plan for a query on request from SQL input. I am having trouble in modifying the planner code. I want to create a path node of the hint plan and make it the plan to be used by the executor. How do I enforce this? Should I create a new Plan for this? How do I create a plan node which can then be given directly to the executor for a particular query? Planner hints have been discussed a lot before as well, and AFAIK there is a wiki page that says why we shouldn't implement them. Have you referred to them? Please share if you have any new points on the same. Regards, Atri

http://wiki.postgresql.org/wiki/Todo (I got to it via the FAQ link on the homepage and the Developer FAQ section therein. You should make sure you've scanned that as well.) Note the final section titled: Features We Do Not Want

Also, you need to consider what you are doing when you cross-post (a bad thing generally) -hackers and -novice. As there is, rightly IMO, no -novice-hackers list, you should have probably just hit up -general. You need to discuss the general why before any meaningful help on the how is going to be considered by hackers. David J.
Re: [HACKERS] Planner hints in Postgresql
Atri Sharma wrote On Mon, Mar 17, 2014 at 9:45 PM, Tom Lane lt; tgl@.pa gt; wrote: David Johnston lt; polobo@ gt; writes: Need to discuss the general why before any meaningful help on the how is going to be considered by hackers. Possibly worth noting is that in past discussions, we've concluded that the most sensible type of hint would not be "use this plan" at all, but "here's what to assume about the selectivity of this WHERE clause". That seems considerably less likely to break than any attempt to directly specify plan details. Isn't using a user-given value for selectivity a pretty risky situation, as it can horribly screw up the plan selection? Why not allow the user to specify an alternate plan and have the planner assign a higher preference to it during plan evaluation? This would allow us to still have a fair evaluation of all possible plans as we do right now and yet have a higher preference for the user-given plan during evaluation.

The larger question to answer first is whether we want to implement something that is deterministic... How about just dropping the whole concept of hinting and providing a way for someone to say "use this plan, or die trying". Maybe require it be used in conjunction with named PREPAREd statements: PREPARE s1 (USING /path/to/plan_def_on_server_or_something_similar) AS SELECT ...; Aside from whole-plan specification, I can definitely see where join/where specification could be useful if it can overcome the current limitation of not being able to calculate inter-table estimations. David J.
Re: [HACKERS] Wiki Page Draft for upcoming release
I sent a post to -general with a much more detailed brain dump of my current understanding on this topic. The main point I'm addressing here is how to recover from this problem. A symptom of the problem is that pg_dump/restore can fail, so saying that (in some instances) the only viable recovery mechanism is pg_dump/restore means that someone so afflicted is going to lose data since their last good dump - if they still have one. However, if the true data table does not actually contain any duplicate data, then such a dump/restore cycle (or, I would think, a REINDEX - or a DROP/CREATE INDEX chain) should resolve the problem. Thus, if there is duplicate data, the user needs to (and can) identify and remove the offending records so that subsequent actions do not fail with a duplicate key error. If this is true, then providing a query (or queries) that can find the problem records and delete them from the table - along with any staging that is necessary (like first dropping affected indexes, if applicable) - would be a nice addition. David J.
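A sketch of such a query, assuming a hypothetical table mytable with a supposedly-unique column keycol (both names invented for illustration); as noted above, any affected unique index should be dropped first so the scan does not go through the possibly-corrupt index:

```sql
-- Find keys that violate the supposed uniqueness, with the physical
-- locations of the offending rows.
SELECT keycol, count(*), array_agg(ctid)
FROM   mytable
GROUP  BY keycol
HAVING count(*) > 1;

-- One way to delete all but one copy of each duplicated key
-- (keeps the row with the lowest ctid; review before running).
DELETE FROM mytable a
USING  mytable b
WHERE  a.keycol = b.keycol
  AND  a.ctid > b.ctid;
```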
Re: [HACKERS] Wiki Page Draft for upcoming release
Josh Berkus wrote All, https://wiki.postgresql.org/wiki/20140320UpdateIssues I'm sure my explanation of the data corruption issue is not correct, so please fix it. Thanks! I presume that because there is no way the master could have sent bad table data to the replication slaves, performing the base backup on the slaves is sufficient; i.e., it is not necessary to back up the master and distribute that? I'd either make the change myself or ask this kind of question somewhere else more appropriate, but I'm not sure of the proper protocol to follow. David J.
Re: [HACKERS] Minimum supported version of Python?
Peter Eisentraut-2 wrote On Sat, 2014-03-15 at 20:55 -0400, Tom Lane wrote: Our documentation claims that the minimum Python version for plpython is 2.3. However, an attempt to build with that on an old Mac yielded a bunch of failures in the plpython_types regression test, It has frequently been the case that the last supported version does not fully pass the regression test, because of the overhead of maintaining variant files. The last supported version is the one that compiles and works. You will note that 2.2 no longer compiles. (It also failed the regression tests for a while before it started not compiling.) Typically, versions fall out of support because we add new functionality that the old Python versions cannot support anymore. all of the form ! ERROR: could not import a module for Decimal constructor ! DETAIL: ImportError: No module named decimal You can make this work by manually installing the decimal module (because it was not part of the core in Python 2.3). Otherwise, this test result legitimately alerts you that some feature is not fully working and that you need to adjust your installation.

It would seem a single error (or warning, depending) like "Missing Dependency" at the start, skipping all tests dependent on the error, would be cleaner - so that someone who blindly runs a clean 2.3 regression test is not drowned in a sea of errors with no clear documentation as to why, or how to resolve it. Now, odds are the assumption that Decimal is present is spread through the code-base, so maybe something less invasive? David J.
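For readers unfamiliar with the error above: on Python 2.3 the stdlib "decimal" module did not exist and had to be installed separately, which is why PL/Python's NUMERIC conversion fails there. A small Python sketch of the probe-at-import pattern; the function name and the fallback-to-text behavior are invented for illustration, not PL/Python's actual source:

```python
# Probe for the Decimal constructor once, at import time, instead of
# failing on every conversion (the pattern the report suggests).
try:
    from decimal import Decimal
except ImportError:        # e.g. a bare Python 2.3 installation
    Decimal = None

def numeric_from_text(text_value):
    """Convert a NUMERIC wire value; degrade to str when Decimal is missing."""
    if Decimal is None:
        return text_value  # caller sees the textual form instead
    return Decimal(text_value)
```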
Re: [HACKERS] Minimum supported version of Python?
Joshua D. Drake wrote On 03/17/2014 07:31 PM, Peter Eisentraut wrote: On Sun, 2014-03-16 at 22:34 -0400, Tom Lane wrote: Well, if you want to consider python 2.3 as supported, I have a buildfarm machine I am about to put online that has 2.3 on it. If I spin it up with python enabled, I expect you to see to it that it starts passing. If you won't do that, I'm going to change the documentation. As I said, according to my testing, 2.3 is supported. If your experience is different, then please submit a reproducible bug report. As for 2.4 vs 2.5, I don't have a lot of faith that we're really supporting anything that's not represented in the buildfarm... There are many other features that the build farm doesn't test and that I don't have a lot of faith in, but I'm not proposing to remove those. I don't control what the build farm tests, I only control my own work. We shouldn't be supporting anything the community doesn't support. Python 2.3 is dead. We shouldn't actively support it nor suggest that we could or should via the docs. There is certainly an argument for Python 2.4 (due to CentOS/RHEL) but other than that... really? JD

+1 generally, though it ignores the bigger question, which is how we are actually defining and proving support; and also the whole "we support 8.4 (for a little longer now), so what related technology should we be either supporting, or at least saying: ya know, this was working just fine 4 years ago if you use exactly these versions, so use those versions". A last known working version in the plpythonu section wouldn't hurt, since actively running all the possible combinations for the 5-year support cycle doesn't make sense. Is Peter /the/ final arbiter of fact in this area, or are there some public buildfarm animals out there that have been designated to fulfill this role? 
Peter claims to be doing the compiling and testing for these, which is great, and would be better if people like Tom were actually aware of that fact, but comments suggest that such testing is currently done offline. To Peter's comment - reporting which version combinations are known to compile is good, but if a regression test exists then it should either pass on a clean version, or it should be documented (ideally right in the tests themselves) what additional configuration is required (though maybe moving that back to configure would be sufficient for our needs). If there are no coverage tests we are not asserting any proof anyway, so if a release compiles and passes regression tests it should be noted as such. Once the regressions start failing without any known fix, it's hard to understand how we could say such a combination is valid/supported anymore... Supported - we will endeavor to fix bugs: add the test case and then fix the problem. Compiles/Tests-OK - we know it worked at one point, but if we discover a bug we're more than likely to just say that the version is no longer supported and known to have bugs: add the test case that causes the regression to fail and move on. David J.
Re: [HACKERS] Is this a bug
fabriziomello wrote On Thu, Mar 13, 2014 at 10:34 AM, Euler Taveira lt; euler@.com gt; wrote: On 13-03-2014 00:11, Fabrízio de Royes Mello wrote: Shouldn't the ALTER statements below raise an exception? For consistency, yes. Who cares? I mean, there is no harm in resetting an unrecognized parameter. Have in mind that tightening it up could break scripts. In general, I'm in favor of validating things. I know this could break scripts, but I think the consistent behavior would be to raise an exception when an option doesn't exist. euler@euler=# reset noname; ERROR: 42704: unrecognized configuration parameter noname LOCAL: set_config_option, guc.c:5220 This is a consistent behavior. Regards,

Probably shouldn't back-patch, but a fix and a release note comment in 9.4 is warranted. Scripts resetting invalid parameters are probably already broken; they just haven't discovered their mistake yet. Do we need an IF EXISTS feature on these as well? ;) David J.
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
Tom Lane-2 wrote Unfortunately, while testing it I noticed that there's a potentially fatal backwards-compatibility problem, namely that the COPY n status gets printed on stdout, which is the same place that COPY OUT data is going. While this isn't such a big problem for interactive use, usages like this one are pretty popular: psql -c 'copy mytable to stdout' mydatabase | some-program With the patch, COPY n gets included in the data sent to some-program, which never happened before and is surely not what the user wants. The same if the -c string uses \copy. There are several things we could do about this: 1. Treat this as a non-backwards-compatible change, and document that people have to use -q if they don't want the COPY tag in the output. I'm not sure this is acceptable. I've mostly used copy to with files and so wouldn't mind if STDOUT had the COPY n sent to it as long as the target file is just the copy contents. 2. Kluge ProcessResult so that it continues to not pass back a PGresult for the COPY TO STDOUT case, or does so only in limited circumstances (perhaps only if isatty(stdout), for instance). The main problem with this is that people will test by sending output to a TTY and see the COPY n. Although if it can be done consistently then you minimize backward incompatibility and encourage people to enforce quiet mode while the command runs... 3. Modify PrintQueryStatus so that command status goes to stderr not stdout. While this is probably how it should've been done in the first place, this would be a far more severe compatibility break than #1. (For one thing, there are probably scripts out there that think that any output to stderr is an error message.) I'm afraid this one is definitely not acceptable, though it would be by far the cleanest solution were it not for compatibility concerns. Yes, it's a moot point but I'm not sure it would be best anyway. 4. As #3, but print the command status to stderr only if it's COPY n, otherwise to stdout. 
This is a smaller compatibility break than #3, but still a break, since COPY status was formerly issued to stdout in non TO STDOUT/FROM STDIN cases. (Note that PrintQueryStatus can't tell whether it was COPY TO STDOUT rather than any other kind of COPY; if we want that to factor into the behavior, we need ProcessResult to do it.) Since we are considering stderr: my (admittedly inexperienced) gut says that using stderr for this is generally undesirable, especially given our existing precedent. stdout is the seemingly correct target, typically, and the existing quiet-mode toggle provides sufficient control for typical needs. 5. Give up on the print-the-tag aspect of the change, and just fix the wrong-line-number issue (so we'd still introduce the copyStream variable, but not change how PGresults are passed around). I'm inclined to think #2 is the best answer if we can't stomach #1. But the exact rule for when to print a COPY OUT result probably still requires some debate. Or maybe someone has another idea? Also, I'm thinking we should back-patch the aspects of the patch needed to fix the wrong-line-number issue. That appears to have been introduced in 9.2; older versions of PG get the above example right. Comments?

I'd like COPY TO anything but STDOUT to emit a COPY n on STDOUT - unless suppressed by -q(uiet). Document that COPY TO STDOUT does not emit COPY n because STDOUT is already assigned for data and so is not available for notifications. Since COPY is more typically used for ETL than a bare SELECT, in addition to back-compatibility concerns, this default behavior seems reasonable. Would it be possible to store the n somewhere and provide a command - like GET DIAGNOSTICS in pl/pgsql - if the user really wants to know how many rows were sent to STDOUT? I doubt this is even useful in the typical use-case for COPY TO STDOUT but figured I'd toss the idea out there. 
Is there anything, besides a desire for consistency, that anyone has or can put forth as a use-case for COPY TO STDOUT emitting COPY n on STDOUT as well? If you are going to view the content inline, and also want a quick count, ISTM you would be more likely to use SELECT to take advantage of all its pretty-print features. If we really need to cater to this use then maybe a --loud-copy-to-stdout switch could be provided to override its default quiet mode. David J.
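To make the pipeline concern above concrete, here are illustrative invocations (they require a live server; mytable/some-program are placeholders carried over from the thread, and the --loud-copy-to-stdout switch mentioned above is only a proposal, not an existing flag):

```
# The popular usage Tom describes: with option 1, "COPY n" would be
# appended to the piped data unless quiet mode is requested.
psql -q -c 'copy mytable to stdout' mydatabase | some-program

# \copy to a file: any tag printed goes to the terminal, never into
# the target file, so this case is unaffected.
psql mydatabase -c "\copy mytable to '/tmp/mytable.tsv'"
```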
Re: Bug: Fix Wal replay of locking an updated tuple (WAS: Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6)
Joshua D. Drake wrote On 03/12/2014 06:15 PM, Tom Lane wrote: Robert Haas lt; robertmhaas@ gt; writes: Discuss. This thread badly needs a more informative Subject line. No kidding. Or at least a link for goodness sake. Although the pgsql-packers list wasn't all that helpful either. A link would be nice, though if -packers is a security list then that may not be a good thing, since -hackers is public... A quick search of Nabble and the Mailing Lists section of the homepage does not indicate that pgsql-packers exists - at least not in any publicly (even if read-only) accessible way. David J.
Re: [HACKERS] The case against multixact GUCs
Josh Berkus wrote Hackers, In the 9.3.3 updates, we added three new GUCs to control multixact freezing. This was an unprecedented move in my memory -- I can't recall ever adding a GUC to a minor release which wasn't backwards compatibility for a security fix. This was a mistake.

It probably should have been handled better, but the decision to make these parameters visible does not itself seem to be the wrong decision - especially when limited to a fairly recently released back-branch.

What makes these GUCs worse is that nobody knows how to set them; nobody on this list and nobody in the field. Heck, I doubt 1 in 1000 of our users (or 1 in 10 people on this list) know what a multixact *is*.

That isn't a reason in itself to not have the capability if it is actually needed.

Further, there's no clear justification why these cannot be set to be the same as our other freeze ages (which our users also don't understand), or a constant calculated portion of them, or just a constant. Since nobody anticipated someone adding a GUC in a minor release, there was no discussion of this topic that I can find; the new GUCs were added as a side effect of fixing the multixact vacuum issue. Certainly I would have raised a red flag if the discussion of the new GUCs hadn't been buried deep inside really long emails.

The release documentation makes a pointed claim that the situation WAS that the two were identical; but the different consumption rates dictated making the multixact configuration independently configurable. So in effect the GUC was always present - just not user-visible. Even if there are no current best practices surrounding this topic, at least this way, as methods are developed, there is an actual place to put the derived value. As a starting point one can simply look at the defaults and, if they have changed the value of the non-multixact setting, apply the same factor to the multixact version.
Now, obviously someone has to think to actually do that - and the release notes probably should have provided such guidance - but as I state explicitly below the issue is more about insufficient communication and education and less about providing the flexibility.

Adding new GUCs which nobody has any idea how to set, or can even explain to new users, is not a service to our users. These should be removed.

Or we should insist that those few who do have an understanding create some kind of wiki document, or even a documentation section, to educate those who are not as knowledgeable in this area. For good reason, much of the recent focus in this area has been on actually getting the feature to work. Presuming it is a desirable feature to have - which it hopefully is, given it made it into the wild - that focus has obviously been necessary given the apparent complexity of the feature (as evidenced by the recent serious bug reports); but hopefully the feature itself is now mostly locked down and education can begin.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/The-case-against-multixact-GUCs-tp5795561p5795573.html
Re: [HACKERS] db_user_namespace a temporary measure
Andrew Dunstan wrote On 03/11/2014 11:50 PM, Jaime Casanova wrote: On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane lt; tgl@.pa gt; wrote: But not sure how to define a unique index that allows (joe, db1) to coexist with (joe, db2) but not with (joe, 0). and why you want that restriction? when you login you need to specify the db, right? if you don't specify the db then is the global 'joe' that want to login You can't login without specifying a db. Every session is connected to a db. cheers andrew

Random thoughts:

So if dave is already a user in db1, only that specific dave can be made a global user - any other dave would be disallowed.

Would user + password be a better PK? Even with the obvious issue that the password part of the key can change. user-password to database is a proper many-to-many relationship. Or see next for something simpler.

A simple implementation would have the global users copied into each database as it is constructed. There would also be a link between each database-specific user and the global master, so that a password change issued against the global user propagates to all the database-specific versions. Construction of a new global user would cause an immediate copy-over, which can fail if the simple name is already in use in any of the existing databases. Promoting becomes a problem that would ideally have a solution - but then you are back to needing to distinguish between dave-db1 and dave-db2...

It would be nice if all users could be global and there were some way to give them permissions on databases. Or go to the opposite extreme where all users are local and the concept of global would have to be added by those desiring it. Basically, move user management at the cluster level to a cluster support application and leave databases free-standing.

David J.
-- View this message in context: http://postgresql.1045698.n5.nabble.com/db-user-namespace-a-temporary-measure-tp5795305p5795609.html
Re: [HACKERS] CREATE TYPE similar CHAR type
mohsencs wrote I want to use CREATE TYPE to create a type that behaves like char. For example: CREATE TABLE test (oneChar char); when I insert a value with length greater than 1 into it, it gets this error: ERROR: value too long for type character(1) I want my type to behave like this, but it behaves like the varchar type.

If you can get over the need for using CREATE TYPE, you'll find that using CREATE DOMAIN with a check constraint will probably meet your needs perfectly.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/CREATE-TYPE-similar-CHAR-type-tp5794946p5794981.html
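A minimal sketch of the CREATE DOMAIN approach suggested above (names are illustrative; the error text differs from char(1)'s, but the length is enforced the same way):

```sql
-- A domain that rejects values longer than one character.
CREATE DOMAIN one_char AS text
    CHECK (char_length(VALUE) <= 1);

CREATE TABLE test (onechar one_char);
INSERT INTO test VALUES ('a');   -- ok
INSERT INTO test VALUES ('ab');  -- fails the domain's CHECK constraint
```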
Re: [HACKERS] Equivalence Rules
Ali Piroozi wrote Hi Which equivalence rules from those listed in the email's attachment are implemented in postgresql? where are they?

What do you mean by where? The various JOINs and UNION/INTERSECT/DIFFERENCE are all defined capabilities. SQL is not purely relational in nature, so some of the mathematical rules may not be implemented exactly as theory would dictate, but largely they are present.

I would suggest providing more context as to why you are asking this, as people are more likely to provide help at this level of technical depth if they know why, and can frame their answers more appropriately.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Equivalence-Rules-tp5794035p5794069.html
Re: [HACKERS] Function sugnature with default parameter
salah jubeh wrote Hello, I find default values confusing when a function is overloaded; below is an example.

CREATE OR REPLACE FUNCTION default_test (a INT DEFAULT 1, b INT DEFAULT 1, c INT DEFAULT 1)
RETURNS INT AS $$
BEGIN
  RETURN a+b+c;
END;
$$ LANGUAGE 'plpgsql';

Provide a real use-case for this need and maybe someone will consider it worthwhile to implement. In the meantime, use VARIADIC, or define only the longest-default signature and provide reasonable default values for all optional arguments. In this case the defaults should be zero, not one.

Or don't make any of the arguments DEFAULT and provide as many overloads as you use. The whole point of DEFAULT was to avoid having to do this for simple cases - you are not compelled to use default values if your use-case is too complex for them. Or just avoid overloading and use different names that better describe what the different versions accomplish. Specifying DEFAULT in the function call seems to defeat the point.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Function-sugnature-with-default-parameter-tp5793737p5793739.html
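A sketch of the confusion being described - defaults make a function callable with fewer arguments, so it can collide with an overload (the error comment paraphrases the behavior rather than quoting exact server output):

```sql
-- All parameters have defaults, so this is callable with 0 to 3 arguments.
CREATE FUNCTION default_test(a int DEFAULT 0, b int DEFAULT 0, c int DEFAULT 0)
RETURNS int AS $$ BEGIN RETURN a + b + c; END; $$ LANGUAGE plpgsql;

-- An overload taking a single int now overlaps with the one-argument call form.
CREATE FUNCTION default_test(a int)
RETURNS int AS $$ BEGIN RETURN a; END; $$ LANGUAGE plpgsql;

SELECT default_test(5);  -- ERROR: the call is ambiguous between the two signatures
```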
Re: [HACKERS] Simplified VALUES parameters
Leon Smith wrote Hi, I'm the maintainer and a primary author of a postgresql client library for Haskell, called postgresql-simple, and I recently investigated improving support for VALUES expressions in this library. As a result, I'd like to suggest two changes to postgresql: 1. Allow type specifications inside AS clauses, for example (VALUES (1,'hello'),(2,'world')) AS update(x int, y text) 2. Have an explicit syntax for representing VALUES expressions which contain no rows, such as VALUES (). (although the precise syntax isn't important to me.) My claim is that these changes would make it simpler for client libraries to properly support parameterized VALUES expressions. If you care, I've included a postscript including a brief background, and a link to my analysis and motivations.

At a high level I don't see how the nature of SQL would allow for either of these things to work. The only reason there even is (col type, col2 type) syntax is because record-returning functions have to have their return type defined during query construction. The result of processing a VALUES clause has to be a normal relation - the subsequent presence of AS simply provides column name aliases, because in the common form each column is assigned a generic name during execution.

Defining a generic empty-values expression has the same problem in that you have to define how many columns, with type and name, the VALUES expression needs to generate.

From what I can see, SQL is not going to readily allow for the construction of virtual tables via parameters. You either need to make those tables non-virtual (even if temporary) or consolidate them into an ARRAY. In short, you - the client library - probably can solve the virtual table problem, but you will have to accommodate user-specified typing somehow in order to supply valid SQL to the server.
The two common solutions for your specified use-case are: either the user creates the needed temporary table and writes the update query to join against that, OR they write the generic single-record update statement and then loop over all desired input values - ideally all done within a transaction. In your situation you should automate that by taking your desired syntax and constructing a complete script that can then be sent to PostgreSQL.

I don't imagine that the need for dynamically specified virtual tables is going to be strong enough for people to dedicate the amount of resources it would take to implement such a capability.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Simplified-VALUES-parameters-tp5793744p5793756.html
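For completeness, the typing problem raised in the proposal can usually be worked around today by casting the columns of the first VALUES row, so the whole virtual table gets concrete types (table and column names here are illustrative):

```sql
UPDATE target_table AS t
SET    y = v.y
FROM   (VALUES (1::int, 'hello'::text),
               (2, 'world')) AS v(x, y)
WHERE  t.x = v.x;
```

The casts on the first row determine the column types for every subsequent row, which is why client libraries only need to emit them once.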
Re: [HACKERS] 'dml' value for log_statement
Sawada Masahiko wrote Hi all, Attached patch provides a new value 'dml' for log_statement. Currently the server logs modification statements AND data definition statements if log_statement is set to 'mod', so we need to set log_statement to 'all' and remove unnecessary information if we would like to log only DML. This patch allows the user to configure the setting in more detail: the server logs a statement only when an INSERT, UPDATE, DELETE or TRUNCATE statement is executed (same as the 'mod' value, but without logging the DDL). Comments? Regards, --- Sawada Masahiko

-1

I'm not fully versed on what log levels provide what data, but if you care about changes to data then ALTER TABLE table is just as important an activity as UPDATE table, so mod exhibits the recommended behavior and dml provides something that should be of minimal value. DDL by itself has value, since monitoring schema changes without worrying about transient data updates is understandable. But a schema change, by definition, alters data.

Maybe further description of why you find mod unsatisfactory would be helpful. Though it is simple enough to just let people use their own judgement.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/dml-value-for-log-statement-tp5790895p5790925.html
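For context, the existing log_statement values under discussion look like this in postgresql.conf (a configuration excerpt, not part of the patch):

```
# Existing log_statement values:
#   none - log no statements
#   ddl  - log all data definition statements (CREATE, ALTER, DROP, ...)
#   mod  - log all ddl, plus data-modifying statements
#          (INSERT, UPDATE, DELETE, TRUNCATE, COPY FROM, ...)
#   all  - log all statements
log_statement = 'mod'
```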
[HACKERS] Re: Patch: regexp_matches variant returning an array of matching positions
Alvaro Herrera-9 wrote Björn Harrtell wrote: I've written a variant of regexp_matches called regexp_matches_positions which instead of returning matching substrings will return matching positions. I found use of this when processing OCR-scanned text and wanted to prioritize matches based on their position. Interesting. I didn't read the patch but I wonder if it would be of more general applicability to return more info in one fell swoop: a function returning a set (position, length, text of match), rather than an array. So instead of first calling one function to get the matches and then their positions, do it all in one pass. (See pg_event_trigger_dropped_objects for a simple example of a function that returns in that fashion. There are several others but AFAIR that's the simplest one.)

Confused as to your thinking. Like regexp_matches, this returns SETOF type[] - in this case integer, versus text for the matches. I could see adding a generic function that returns a SETOF named composite (match varchar[], position int[], length int[]) and the corresponding type. I'm not imagining a situation where you'd want the position but not the text, so having to evaluate the regexp twice seems wasteful. The length is probably a waste though, since it can readily be gotten from the text and is less often needed. But if it's pre-calculated anyway...

My question is what position is returned in a multiple-match situation? The supplied test only covers the simple, non-global, situation. It needs to exercise empty sub-matches and global searches. One theory is that the first array slot should cover the global position of match zero (i.e., the entire pattern) within the larger document, while sub-matches would be relative offsets within that single match. This conflicts, though, with the fact that _matches only returns array elements for () items and never for the full match - the goal in this function being parallel un-nesting. But as nesting is allowed it is still possible for that situation to occur.
How does this resolve in the patch?

SELECT regexp_matches('abcabc','((a)(b)(c))','g');

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Patch-regexp-matches-variant-returning-an-array-of-matching-positions-tp5789321p5789414.html
[HACKERS] Re: Patch: regexp_matches variant returning an array of matching positions
Erik Rijkers wrote On Wed, January 29, 2014 05:16, David Johnston wrote: How does this resolve in the patch? SELECT regexp_matches('abcabc','((a)(b)(c))','g'); With the patch:

testdb=# SELECT regexp_matches('abcabc','((a)(b)(c))','g'),
                regexp_matches_positions('abcabc','((a)(b)(c))');
 regexp_matches | regexp_matches_positions
----------------+--------------------------
 {abc,a,b,c}    | {1,1,2,3}
 {abc,a,b,c}    | {1,1,2,3}
(2 rows)

The {1,1,2,3} in the second row is an artifact/copy from set-value-function-in-select-list repetition and has nothing to do with the second match.

testdb=# SELECT regexp_matches('abcabc','((a)(b)(c))','g'),
                regexp_matches_positions('abcabc','((a)(b)(c))', 'g');
 regexp_matches | regexp_matches_positions
----------------+--------------------------
 {abc,a,b,c}    | {1,1,2,3}
 {abc,a,b,c}    | {4,4,5,6}
(2 rows)

As expected.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Patch-regexp-matches-variant-returning-an-array-of-matching-positions-tp5789321p5789434.html
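For what it's worth, the per-match position semantics shown above can be illustrated in Python (this only mirrors the expected behavior of the patch with the 'g' flag; it is not the patch itself):

```python
import re

def match_positions(pattern, text):
    """For each match of `pattern` in `text`, return the 1-based start
    positions of each capturing group, mirroring the output shown for
    regexp_matches_positions(..., 'g')."""
    results = []
    for m in re.finditer(pattern, text):
        # Python offsets are 0-based; the SQL function is 1-based.
        results.append([m.start(i) + 1 for i in range(1, m.re.groups + 1)])
    return results

print(match_positions(r'((a)(b)(c))', 'abcabc'))  # [[1, 1, 2, 3], [4, 4, 5, 6]]
```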
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
Andres Freund-3 wrote On 2014-01-06 09:12:03 -0800, Mark Dilger wrote: The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces. Not doing so makes the test code simpler at the expense of reducing test coverage. I am using the same binaries. The chroot directories are not chroot jails. I'm intentionally bind mounting out to all the other directories on the system, except the other clusters' data directories and tablespace directories. The purpose of the chroot is to make the paths the same on all clusters without the clusters clobbering each other. I don't think the benefit of being able to test tablespaces without restarts comes even close to offsetting the cost of requiring sudo permissions and introducing OS dependencies. E.g. there's pretty much no hope of making this work sensibly on windows. So I'd just leave out that part.

Only skimming this thread, but even if only a handful of buildfarm animals can run this extended test bundle because of the restrictive requirements, that is likely better than discarding the tests altogether. The main thing in this case is to segregate out this routine so that it has to be invoked explicitly, and ideally in an "ignore if prerequisites are missing" manner. Increasing the likelihood and frequency of test runs on what is a fairly popular platform - tests that cover non-OS-specific code as well - has benefits. As long as it doesn't poison anything else I don't see much harm coming of it.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Re-In-core-regression-tests-for-replication-cascading-archiving-PITR-etc-Michael-Paquier-tp5785400p578.
[HACKERS] Re: Fixing bug #8228 (set-valued function called in context that cannot accept a set)
Tom Lane-2 wrote I kinda forgot about this bug when I went off on vacation: http://www.postgresql.org/message-id/ E1UnCv4-0007oF-Bo@.postgresql

Just to clarify: This patch will cause both executions of the example query to fail with the set-valued function... error. Also, the reason the ::varchar one did not fail was because no cast function was ever called, but the ::varchar(30) forced a function call and thus prompted the error when the second record and resultant regexp_matches expression was encountered.

Thanks!

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Fixing-bug-8228-set-valued-function-called-in-context-that-cannot-accept-a-set-tp5785622p5785629.html
[HACKERS] Re: Fixing bug #8228 (set-valued function called in context that cannot accept a set)
David Johnston wrote Tom Lane-2 wrote I kinda forgot about this bug when I went off on vacation: http://www.postgresql.org/message-id/ E1UnCv4-0007oF-Bo@.postgresql Just to clarify: This patch will cause both executions of the example query to fail with the set-valued function... error. Also, the reason the ::varchar one did not fail was because no cast function was ever called, but the ::varchar(30) forced a function call and thus prompted the error when the second record and resultant regexp_matches expression was encountered. Thanks! David J.

Sorry for the imprecise English. I believe both of the following items but would like confirmation/clarification of my understanding. The whole varchar/varchar(30) discrepancy is bothersome, and since the example forces a function call via the use of lower(...), and doesn't test the non-function situation, I am concerned this patch is incorrect. If the first item is not true, i.e. this patch makes both alternatives work, then I think the wrong solution was chosen - or at least not fully vetted.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Fixing-bug-8228-set-valued-function-called-in-context-that-cannot-accept-a-set-tp5785622p5785631.html
[HACKERS] Re: Fixing bug #8228 (set-valued function called in context that cannot accept a set)
Tom Lane-2 wrote David Johnston lt; polobo@ gt; writes: The whole varchar/varchar(30) discrepancy is bothersome and since the example forces a function-call via the use of lower(...), and doesn't test the non-function situation, I am concerned this patch is incorrect. The reason casting to varchar(30) fails is that that results in invocation of a function (to enforce the length limit). Casting to varchar is just a RelabelType operation, which doesn't have two different code paths for set and not-set inputs. I did check the patch against your original example, but I thought using lower() made the purpose of the regression test case more apparent.

Yeah, I caught that part. My focus was on the non-function version. Not being able to apply the patch and test it myself, it sounds like you likely made the function-invocation version succeed along with the original relabel-only version. I guess backward-compatibility concerns force this solution, but after thinking through what was happening I was leaning more toward making both queries fail. An SRF in a CASE expression seems like a foot-gun to me. An SRF in the select-list is somewhat of a foot-gun too, but understandable given the recency of the addition of LATERAL to our toolbox.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Fixing-bug-8228-set-valued-function-called-in-context-that-cannot-accept-a-set-tp5785622p5785636.html
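A sketch of the LATERAL alternative to an SRF in the select-list (table and column names are illustrative):

```sql
-- Instead of an SRF in the select list:
--   SELECT d.id, regexp_matches(d.txt, 'some-pattern', 'g') FROM docs AS d;
-- the set-returning call can live in the FROM clause:
SELECT d.id, m.match
FROM   docs AS d,
       LATERAL regexp_matches(d.txt, 'some-pattern', 'g') AS m(match);
```

This keeps the row-multiplying behavior explicit and avoids the set-valued-function-in-an-expression pitfalls discussed above.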
Re: [HACKERS] trailing comment ghost-timing
Andreas Karlsson wrote On 12/24/2013 02:05 AM, Erik Rijkers wrote: With \timing on, a trailing comment yields a timing.

# test.sql
select 1;
/* select 2 */

$ psql -f test.sql
 ?column?
----------
        1
(1 row)

Time: 0.651 ms
Time: 0.089 ms

I assume it is timing something about that comment (right?). Confusing and annoying, IMHO. Is there any chance such trailing ghost-timings can be removed? This seems to be caused by psql sending the comment to the server to evaluate as a query. I personally think timing should always output something for every command sent to the server. To fix this problem I guess one would have to modify psql_scan() to check if a scanned query only contains comments and then not send it to the server at all. The minimal file to reproduce it is: /**/ -- Andreas Karlsson

I need to be convinced that the server should not just silently ignore trailing comments. I'd consider an exception if the only text sent is a comment (in such a case we should throw an error), but if valid commands are sent and there is just some comment text at the end, it should be ignored the same as if the comments were embedded in the middle of the query text. I've encountered other clients that output phantom results in this situation, and solving it at the server seems worthwhile so client applications do not have to care.

In the example case, I think, putting the comment before the command results in only one timing. This inconsistency is a symptom of this situation being handled incorrectly.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/trailing-comment-ghost-timing-tp5784473p5784484.html
Re: [HACKERS] array_length(anyarray)
Marko Tiikkaja-4 wrote On 2013-12-18 22:32, Andrew Dunstan wrote: You're not really free to assume it - you'll need an exception handler for the other-than-1 case, or your code might blow up. This seems to be codifying a bad pattern, which should be using array_lower() and array_upper() instead. That's the entire point -- I *want* my code to blow up. If someone passes a multi-dimensional array to a function that assumes its input is one-dimensional and its indexes start from 1, I want it to be obvious that the caller did something wrong. Now I either copy-paste lines and lines of code to always test for the weird cases or my code breaks in subtle ways. This is no different from an Assert() somewhere -- if the caller breaks the documented interface, it's his problem, not mine. And I don't want to waste my time coding around the fact that this simple thing is so hard to do in PG.

1) Why can't we just make the second argument of the current function optional and default to 1?

2) How about providing a function that returns the one-dimensional/lower-bound-1 input array as-is, or raises an exception if the input array does not conform?

-- not tested / pseudo-code
CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray AS $$
BEGIN
  IF array_ndims(arr) IS NULL THEN RETURN arr; END IF;  -- empty array
  IF array_ndims(arr) <> 1 THEN RAISE EXCEPTION 'array is not one-dimensional'; END IF;
  IF array_lower(arr, 1) <> 1 THEN RAISE EXCEPTION 'array lower bound is not 1'; END IF;
  RETURN arr;
END;
$$ LANGUAGE plpgsql;

I can also see wanting one-dimensional enforced without requiring the lower bound to be 1, so maybe a separate function for that.

Usage: SELECT array_length(array_normal(input_array))

I could see this being especially useful for a domain and/or column constraint definition, and also allowing for a textbook case of separation of concerns. I am torn, but mostly opposed, to making an array_length(anyarray) function with these limitations enforced - especially if other similar functions are not created at the same time. I fully agree that array_length(anyarray) should be a valid call without requiring the user to specify ", 1" by rote.
Tangential Question: Is there any way to define a non-1-based array without using array-literal syntax, i.e. by using ARRAY[1,2,3] syntax?

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/array-length-anyarray-tp5783950p5783972.html
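For reference, the array-literal form is currently the only way to spell a non-default lower bound:

```sql
-- The literal's bounds decoration sets the lower bound explicitly:
SELECT '[0:2]={10,20,30}'::int[];                  -- a three-element array indexed 0..2
SELECT array_lower('[0:2]={10,20,30}'::int[], 1);  -- returns 0

-- The ARRAY constructor has no bounds syntax; ARRAY[10,20,30] is always 1-based.
```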
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote From: David Johnston lt; polobo@ gt; 5. FATAL: terminating walreceiver process due to administrator command 6. FATAL: terminating background worker \%s\ due to administrator command 5 and 6: I don't fully understand when they would happen, but they likely fall into the same category: the DBA should know what is going on with their server and confirm any startup/shutdown activity it is involved with. They might be better categorized at NOTICE level if they were in response to an administrator action, versus in response to a crashed process, but even for the user-initiated situation, making sure they hit the log - albeit using FATAL - is totally understandable and IMO desirable. #5 is output when the DBA shuts down the replication standby server. #6 is output when the DBA shuts down the server if he is using any custom background worker. These messages are always output. What I'm seeing as a problem is that FATAL messages are output in a normal situation, which worries the DBA, and those messages don't help the DBA with anything. They merely worry the DBA.

While worry is something to be avoided, not logging messages is not the correct solution. On the project side, looking for ways and places to better communicate with the user is worthwhile, in the absence of adding additional complexity to the logging system. The user/DBA, though, is expected to learn how the process works - ideally in a testing environment where worry is much less of a problem - and understand that some of what they are seeing are client-oriented messages that they should be aware of but that do not indicate server-level issues by themselves. They should probably run the log through a filter (a template of which the project should provide) that takes the raw log and filters out those things that, relative to the role and context, are really better classified as NOTICE even if the log-level shown is FATAL. Raw logs should be verbose so that tools have more data with which to make decisions.
You can always hide what is provided, but you can never show what has been omitted. I get that people will scan raw logs, but targeting that lowest denominator isn't necessarily the best thing to do. Instead, getting those people on board with better practices, and showing (and providing) them helpful tools, should be the goal.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/RFC-Shouldn-t-we-remove-annoying-FATAL-messages-from-server-log-tp5781899p5782267.html
Re: [HACKERS] RFC: programmable file format for postgresql.conf
Álvaro Hernández Tortosa wrote Note that you are not required to maintain your configuration data in a postgresql.conf-formatted file. You can keep it anywhere you like, GUI around in it, and convert it back to the required format. Most of the I think it is not a very good idea to encourage GUI tools or tools to auto-configure postgres to use a separate configuration file and then convert it to postgresql.conf. That introduces a duplicity with evil problems if either source of data is modified out-of-the-expected-way. That's why I'm suggesting a config file that is, at the same time, usable by both postgres and other external tools. That also enables other features such as editing the config file persistently through a SQL session.

For my money I'd rather have a single file and/or directory structure where raw configuration settings are saved in the current 'key = value' format, with simple comments allowed and ignored by PostgreSQL. Being simple key-value, the risk of out-of-the-expected-way changes would be minimal. If you want to put an example configuration file out there - one that will not be considered the true configuration - with lots of comments and meta-data, then great. I'm hoping that someday there is a curses-based or even full-fledged GUI that beginners can use to generate the desired configuration.

If we want to put a separate configuration meta-data file out there to basically provide a database from which third-party tools can pull this information, then great. I would not incorporate that same information into the main PostgreSQL configuration file/directory structure. The biggest advantage is that the meta-data database can be readily modified without any concern about such changes impacting running systems upon update. Then tools simply need to import two files instead of one, link the meta-data key to the configuration key, and do whatever they were going to do anyway.
If indeed the target audience is going to be novices, then a static text-based document is not going to be the most desirable interface to present. At worst we should simply include a comment-link at the top of the document to a web page where an interactive tool for configuration file creation would exist. That tool, at the end of the process, could provide the user with text to copy-paste/save into a specified area on the server so that the customizations made would override the installed defaults.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/RFC-programmable-file-format-for-postgresql-conf-tp5781097p5782175.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
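For reference, the 'key = value' format discussed above is what postgresql.conf already uses today; a minimal fragment looks like this (the setting names are real GUCs, the values are purely illustrative):

```
# hypothetical snippet of a key = value configuration file
work_mem = '64MB'          # memory per sort/hash operation
shared_buffers = '1GB'     # shared memory for the buffer cache
log_min_messages = warning # minimum severity written to the server log
```

Comments after `#` are ignored by the server, which is exactly the "simple comments allowed and ignored" property referred to above.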
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth wrote:
> Tom Lane writes:
>> Please don't object that that doesn't look exactly like the syntax for calling the function, because it doesn't anyway --- remember you also need ORDER BY in the call.
>>
>> Actually, now that I think of it, why not use this syntax for declaration and display purposes:
>>
>>     type1, type2 ORDER BY type3, type4
>>
>> This has nearly as much relationship to the actual calling syntax as the WITHIN GROUP proposal does,
>
> But unfortunately it looks exactly like the calling sequence for a normal aggregate with an order by clause - I really think that is potentially too much confusion. (It's one thing not to look like the calling syntax; it's another to look exactly like a valid calling sequence for doing something _different_.)
>
> -- Andrew (irc:RhodiumToad)

How about:

    type1, type2 GROUP ORDER type3, type4

or:

    GROUP type1, type2 ORDER type3, type4

David J.
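For readers following along, the calling syntax being contrasted with the declaration syntax looks like this (percentile_cont is one of the standard ordered-set aggregates; the table and column names here are made up for illustration):

```sql
-- direct argument (0.5) before WITHIN GROUP; the ORDER BY column is aggregated
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY income)
FROM households;
```

The declaration-syntax debate above is about how such an aggregate's signature should be displayed, given that neither proposal matches this call shape exactly.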
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote:
> From: Tom Lane
>> There is no enthusiasm for a quick-hack solution here, and most people don't actually agree with your proposal that these errors should never get logged. So no, that is not happening. You can hack your local copy that way if you like, of course, but it's not getting committed.
>
> Oh, I may have misunderstood your previous comments. I got the impression that you and others regard those messages (except "too many clients") as unnecessary in the server log.
>
> 1. FATAL: the database system is starting up
> 2. FATAL: the database system is shutting down
> 3. FATAL: the database system is in recovery mode
> 5. FATAL: terminating walreceiver process due to administrator command
> 6. FATAL: terminating background worker "%s" due to administrator command
>
> Could you tell me why these are necessary in the server log? I guess like this. Am I correct?
>
> * #1 through #3 are necessary for the DBA to investigate and explain to the end user why he cannot connect to the database.
> * #4 and #5 are unnecessary for the DBA. I can't find out any reason why these are useful for the DBA.

For me, 1-3 are unusual events in production situations, and so knowing when they occur, and confirming they occurred for a good reason, is a key job of the DBA.

5 and 6: I don't fully understand when they would happen, but they likely fall into the same "the DBA should know what is going on with their server" category - the DBA should confirm any startup/shutdown activity the server is involved in. They might be better categorized at NOTICE level if they were in response to an administrator action, versus in response to a crashed process, but even for the user-initiated situation, making sure they hit the log - while using FATAL - is totally understandable and IMO desirable.

I'd ask: in what situations are these messages occurring so frequently that they are becoming noise instead of useful data? Sorry if I missed your use-case explanation up-thread.

David J.
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
David Johnston wrote:
> [full text of the preceding message, quoted verbatim, snipped]
Went and scanned the thread: PITR/failover is not generally that frequent an occurrence, but I will agree that these events are likely common during one. Maybe PITR/failover mode can output something in the logs to alleviate user angst about these frequent events? I'm doubting that a totally separate mechanism can be used for this mode, but instead of looking for things to remove, how about adding some additional coddling to the logs at the beginning and end of the mode change? Thought-provoking only, as I have not actually been a user of said feature.

David J.
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote:
> From: Tom Lane
>> There is no enthusiasm for a quick-hack solution here, and most people don't actually agree with your proposal that these errors should never get logged. So no, that is not happening. You can hack your local copy that way if you like, of course, but it's not getting committed.
>
> Oh, I may have misunderstood your previous comments. I got the impression that you and others regard those messages (except "too many clients") as unnecessary in the server log.
>
> 1. FATAL: the database system is starting up

How about altering the message to tone down the severity by a half-step...

FATAL: (probably) not! - the database system is starting up

David J.
[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?
Tom Lane wrote:
> MauMau writes:
>> Shouldn't we lower the severity of those messages, or avoid sending them to the server log?
>
> No. They are FATAL so far as the individual session is concerned. Possibly some documentation effort is needed here, but I don't think any change in the code behavior would be an improvement.
>
>> 1. FATAL: the database system is starting up
>> 2. FATAL: the database system is shutting down
>> 3. FATAL: the database system is in recovery mode
>> 4. FATAL: sorry, too many clients already
>>
>> Report these as FATAL to the client because the client wants to know the reason. But don't output them to the server log because they are not necessary for DBAs (4 is subtle.)
>
> The notion that a DBA should not be allowed to find out how often #4 is happening is insane.

Agreed, #4 is definitely DBA territory.

ISTM that instituting some level of categorization for messages would be helpful. Then logging and reporting frameworks would be able to identify and segregate the logs in whatever way they and the configuration deem appropriate. For example:

    FATAL: [LOGON] too many clients already

I'd make the category output disabled by default for a long while, then eventually enabled by default but leaving the ability to disable it. Calls that do not supply a category get [N/A] output in category mode.

David J.
Re: [HACKERS] WITHIN GROUP patch
Tom Lane wrote:
> Further questions about WITHIN GROUP: I believe that the spec requires that the direct arguments of an inverse or hypothetical-set aggregate must not contain any Vars of the current query level. They don't manage to say that in plain English, of course, but in the hypothetical-set function case the behavior is defined in terms of this sub-select:
>
>     ( SELECT 0, SK1, ..., SKK FROM TNAME
>       UNION ALL
>       VALUES (1, VE1, ..., VEK) )
>
> where SKn are the sort key values, TNAME is an alias for the input table, and VEn are the direct arguments. Any input-table Vars the aggregate might refer to are thus in scope only for the SKn, not the VEn. (This definition also makes it clear that the VEn are to be evaluated once, not once per row.) In the inverse distribution function case, they resort to claiming that the value of the inverse distribution function argument can be replaced by a numeric literal, which again makes it clear that it's supposed to be evaluated just once.
>
> So that's all fine, but the patch seems a bit confused about it. If you try to cheat, you get an error message that's less than apropos:
>
>     regression=# select cume_dist(f1) within group(order by f1) from text_tbl;
>     ERROR:  column "text_tbl.f1" must appear in the GROUP BY clause or be used in an aggregate function
>     LINE 1: select cume_dist(f1) within group(order by f1) from text_tbl...
>
> I'd have hoped for a message along the line of "fixed arguments of cume_dist() must not contain variables", similar to the phrasing we use when you try to put a variable in LIMIT. Also, there are some comments and code changes in nodeAgg.c that seem to be on the wrong page here:
>
>     +    /*
>     +     * Use the representative input tuple for any references to
>     +     * non-aggregated input columns in the qual and tlist.  (If we are not
>     +     * grouping, and there are no input rows at all, we will come here
>     +     * with an empty firstSlot ... but if not grouping, there can't be any
>     +     * references to non-aggregated input columns, so no problem.)
>     +     *
>     +     * We do this before finalizing because for ordered set functions,
>     +     * finalize_aggregates can evaluate arguments referencing the tuple.
>     +     */
>     +    econtext->ecxt_outertuple = firstSlot;
>
> The last two lines of that comment are new; all the rest was moved from someplace else. Those last two lines are wrong according to the above reasoning, and if they were correct, the argument made in the pre-existing part of the comment would be all wet, meaning the code could in fact crash when trying to evaluate a Var reference in finalize_aggregates. So I'm inclined to undo this part of the patch (and anything else that's rearranging logic with an eye to Var references in finalize_aggregates), and to try to fix parse_agg.c so it gives a more specific error message for this case.

The original design references the spec as allowing a column reference if it is a group by key:

    select cume_dist(f1) within group (order by f1) from text_tbl group by f1;

No example was shown where this would be useful... but nonetheless the comment and error both presume that such is true. I would presume the implementation would only supply rows for sorting from the single group in question, so for practical purposes the columns have a constant value for the entirety of the evaluation; this makes sense and acts just like PARTITION BY on a window aggregate.

Question: are there any tests that exercise this desired behavior? That is assuming agreement can be had that the GROUP BY behavior is indeed spec, or something custom we want to support.

David J.
Re: [HACKERS] Add full object name to the tag field
Robert Haas wrote:
> On Mon, Dec 2, 2013 at 9:49 AM, Asit Mahato wrote:
>> Hi all, I am a newbie. I am unable to understand the to do statement given below.
>>
>> Add full object name to the tag field. eg. for operators we need '=(integer, integer)', instead of just '='.
>>
>> please help me out with an example.
>>
>> Thanks and Regards, Asit Mahato
>
> Cast the OID of the operator to regoperator instead of regoper.

This seems too simple an answer to be useful, and utterly confusing to the OP.

The ToDo item in question is in the pg_dump/pg_restore section. It would seem to be referring to this e-mail thread:

Re: pg_dump sort order for functions
http://www.postgresql.org/message-id/flat/9837222c1001120712t1e64eb2fg9f502757b6002...@mail.gmail.com#9837222c1001120712t1e64eb2fg9f502757b6002...@mail.gmail.com

though there is no link to prior discussion attached to the ToDo item. The thread itself suggests there was yet prior discussion on the topic, though my quick search did not turn anything up.

It seems that not all objects (though it appears functions are currently one exception) are fully descriptive in their tag/name output, which makes deterministic ordering more difficult. The goal is, say for operators, to output not only the base operator symbol (regoper) but also the types associated with the left-hand and right-hand sides (regoperator). The additional type information makes the entire name unique (barring cross-schema conflicts at least, which can be mitigated) and allows for deterministic ordering.

Asit, hopefully this gives you enough context to ask better questions which others can answer more fully. Also, it may help to give more details about yourself and your goals, and not just throw out that you are a newbie. The latter gives people little guidance in how they should structure their help.

David J.
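A minimal illustration of the regoper vs. regoperator difference Robert alludes to (run against a live catalog; the exact row matched depends on the installation):

```sql
-- regoper renders just the operator name; regoperator appends the
-- left- and right-hand argument types, which is what the ToDo asks for
SELECT oid::regoper     AS short_name,   -- e.g. =
       oid::regoperator AS full_name     -- e.g. =(integer,integer)
FROM pg_operator
WHERE oprname = '='
  AND oprleft  = 'integer'::regtype
  AND oprright = 'integer'::regtype;
```

Because `=` is heavily overloaded, only the regoperator form is unique, which is why it enables deterministic ordering of dump entries.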
[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Robert Haas wrote:
>> Issuing ROLLBACK outside of a transaction block has the sole effect of emitting a warning.
>
> Sure, that sounds OK. ...Robert

+1 for:

    Issuing ROLLBACK outside of a transaction block has no effect except emitting a warning.

In all of these cases we are assuming that the user understands that emitting a warning means that something is being logged to disk and is thus causing a resource drain. I like explicitly saying that issuing these commands is pointless/has no effect; being indirect and saying that the only thing they do is emit a warning omits any explicit explanation of why. And while I agree that logging the warning is an effect, it is not the primary/direct effect that the user cares about.

I would maybe change the above to:

    Issuing ROLLBACK outside of a transaction block has no effect: thus it emits a warning [to both user and log file].

I do like "thus" instead of "except" due to the explicit causality link that it establishes. We emit a warning because what you just did is pointless.

David J.
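The behavior under discussion, as it appears in a psql session (the message text is from current releases; exact wording may vary by version):

```sql
-- No transaction block is open here, so the command is a no-op:
ROLLBACK;
-- WARNING:  there is no transaction in progress
-- ROLLBACK
```

The command still reports success (`ROLLBACK`); the warning is the only observable difference from issuing it inside a transaction block.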
Re: [HACKERS] bytea_ouput = escape vs encode(byte, 'escape')
Jim Nasby wrote:
> I'm wondering why bytea_output = escape produces different output than encode(byte, 'escape') does. Is this intentional? If so, why?
>
> cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
>   cr  |  lf
> ------+------
>  \x0d | \x0a
> (1 row)
>
> cnuapp_prod@postgres=# set bytea_output = escape;
> SET
> cnuapp_prod@postgres=# select e'\r'::bytea AS cr, e'\n'::bytea AS lf;
>   cr  |  lf
> ------+------
>  \015 | \012
> (1 row)
>
> cnuapp_prod@postgres=# select encode(e'\r'::bytea,'escape') AS cr, encode(e'\n'::bytea, 'escape') AS lf;
>  cr | lf
> ----+----
>  \r +|   +
>     |
> (1 row)

encode takes a bytea and produces what it would be as text (using the specified encoding to perform the conversion). The bytea output examples are simple output of the contents of the byte array without any supposition as to what those bytes represent. It is strictly a serialization format, not an encoding/decoding of the contents. In this example the two functions are acting as paired input/output. I'm thinking the direction you are assuming from the word "encode" is confusing you - as it did me at first.

David J.
[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Bruce Momjian wrote:
>>> - Issuing ABORT when not inside a transaction does
>>> - no harm, but it will provoke a warning message.
>>> + Issuing ABORT outside of a transaction block has no effect.
>>
>> Those things are not the same.
>
> Uh, I ended up mentioning "no effect" to highlight that it does nothing, rather than mentioning a warning. Would people prefer I say "warning"? Or should I say "issues a warning because it has no effect" or something? It is easy to change.
>
>> I'd revert the change Robert highlights above. ISTM you've changed the code to match the documentation; why would you then change the docs?
>
> Well, I did it to make it consistent. The question is what to write for _all_ of the new warnings, including SET. Do we say warning, do we say it has no effect, or do we say both? The ABORT is just one case of that.

How about: "Issuing [command] outside of a transaction has no effect and will provoke a warning."

I dislike "does no harm" because it can do harm if someone thinks the current state is different than it really is. It is good to indicate that a warning is emitted if this is done in error, thus reinforcing the fact that people should be looking at their warnings.

"when not inside" uses a negative modifier while "outside" is more direct and thus easier to read, IMO. The phrase "transaction block" seems wordy, so I omitted the word "block".

David J.
Re: [HACKERS] why semicolon after begin is not allowed in postgresql?
Josh Berkus wrote:
> On 11/25/2013 03:36 PM, David Johnston wrote:
>> Doh! IF / THEN / ELSE / ENDIF (concept, not syntax)
>>
>> That also does help to reinforce the point being made here...
>
> What point?

That the status quo should be maintained.

David J.
Re: [HACKERS] why semicolon after begin is not allowed in postgresql?
AK wrote:
> Kevin, I do see your logic now, but this thing is a common mistake - it means that this seems counter-intuitive to some people. What would happen if we applied Occam's razor and just removed this rule? All existing code would continue to work as is, and we would have one less rule to memorize. That would make PostgreSQL a slightly better product, right?

I'm somewhat on the fence about this but am leaning toward maintaining the status quo, mostly because of the analogy with IF ... END IF; versus the SQL BEGIN; command, which is an entirely separate construct.

I would maybe change the documentation so that instead of simply dictating a rule we explain why the syntax is the way it is - like this thread is doing. If readers consciously omit the semicolon, hopefully they also understand that what they are beginning is a code block in plpgsql as opposed to a SQL transaction.

That said, technical purity isn't always a good answer. I'd be inclined to let someone passionate enough about the idea implement it and critique that, instead of disallowing it outright; but in the end that is likely to produce the same result.

David J.
Re: [HACKERS] why semicolon after begin is not allowed in postgresql?
Mark Kirkwood wrote:
> Postgres supports many procedural languages (e.g. plperl, plpython) and all of these have different grammar rules from SQL - and from each other. We can't (and shouldn't) try altering them to be similar to SQL - it would defeat the purpose of providing a procedural environment where the given language works as advertised. So in the case of plpgsql - it needs to follow the Ada grammar, otherwise it would be useless.

I do not follow the "useless" conclusion - what, in the present day, does Ada have to do with it? And the request is to alter only plpgsql, not all the other languages. To the casual end user, plpgsql is an internal language under our full control and installed by default in all new releases. Is it really unreasonable to expect us to design in some level of coordination between it and SQL?

Cross-compatibility is a valid reason, though I'm guessing that with all the inherent differences between our standard PL and other databases' PLs, making this change would not be a materially noticeable additional incompatibility. I'll even accept "language consistency" and "not worth the effort of special-casing", but mostly because the error is immediate and obvious, and the solution is simple and readily learned.

A side observation: why does DECLARE not require a block-end keyword, but instead BEGIN acts as effectively both start and end? BEGIN, IF, FOR, etc. all come in pairs but DECLARE does not.

David J.
Re: [HACKERS] why semicolon after begin is not allowed in postgresql?
Andrew Dunstan wrote:
> On 11/25/2013 06:13 PM, David Johnston wrote:
>> A side observation: why does DECLARE not require a block-end keyword but instead BEGIN acts as effectively both start and end? BEGIN, IF, FOR, etc... all come in pairs but DECLARE does not.
>
> A complete block is:
>
>     [ DECLARE
>         declarations ]
>     BEGIN
>         statements
>     [ EXCEPTION
>         handlers ]
>     END
>
> The declare and exception parts are optional, as indicated. Does that make it clearer?

Doh! IF / THEN / ELSE / ENDIF (concept, not syntax)

That also does help to reinforce the point being made here...

David J.
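Filling in the block skeleton above, a complete PL/pgSQL block looks like this (a throwaway DO block; the variable names are illustrative):

```sql
DO $$
DECLARE                          -- optional section: no END pairs with DECLARE...
    numerator   integer := 10;
    denominator integer := 0;
BEGIN                            -- ...BEGIN both closes it and opens the body
    RAISE NOTICE 'result = %', numerator / denominator;
EXCEPTION                        -- optional handler section
    WHEN division_by_zero THEN
        RAISE NOTICE 'caught division by zero';
END;                             -- a single END closes the whole block
$$;
```

Note that the trailing `END;` closes the entire DECLARE/BEGIN/EXCEPTION unit, which is why DECLARE needs no closer of its own.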
Re: [HACKERS] Why is UPDATE with column-list syntax not implemented
AK wrote:
> The 9.3 documentation says:
>
>> According to the standard, the column-list syntax should allow a list of columns to be assigned from a single row-valued expression, such as a sub-select:
>>
>>     UPDATE accounts SET (contact_last_name, contact_first_name) =
>         (SELECT last_name, first_name FROM salesmen
>          WHERE salesmen.id = accounts.sales_id);
>>
>> This is not currently implemented — the source must be a list of independent expressions.
>
> Why is this not implemented? Is it considered inconvenient to use, or difficult to implement, or not important enough, or some other reason?

I cannot answer why, but I too would like to see this. I actually asked this a long while back but cannot seem to find my posting or recall the response.

David J.
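In the meantime, the same effect can be had with PostgreSQL's non-standard UPDATE ... FROM form (using the table and column names from the documentation example quoted above):

```sql
-- join the source table in FROM instead of using a row-valued assignment
UPDATE accounts
   SET contact_last_name  = s.last_name,
       contact_first_name = s.first_name
  FROM salesmen s
 WHERE s.id = accounts.sales_id;
```

Each target column still gets an independent expression, which is exactly the restriction the documentation describes; the FROM clause just makes the shared sub-select unnecessary.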
Re: [HACKERS] additional json functionality
Hannu Krosing wrote:
> On 11/18/2013 06:49 PM, Josh Berkus wrote:
>> On 11/18/2013 06:13 AM, Peter Eisentraut wrote:
>>> On 11/15/13, 6:15 PM, Josh Berkus wrote:
>>>> Thing is, I'm not particularly concerned about *Merlin's* specific use case, which there are ways around. What I am concerned about is that we may have users who have years of data stored in JSON text fields which won't survive an upgrade to binary JSON, because we will stop allowing certain things (ordering, duplicate keys) which are currently allowed in those columns. At the very least, if we're going to have that kind of backwards compatibility break we'll want to call the new version 10.0.
>>>
>>> We could do something like SQL/XML and specify the level of validity in a typmod, e.g., json(loose), json(strict), etc.
>>
>> Doesn't work; with XML, the underlying storage format didn't change. With JSONB, it will ... so changing the typmod would require a total rewrite of the table. That's a POLS violation if I ever saw one.
>
> We do rewrites on typmod changes already. To me having json(string) and json(hstore) does not seem too bad.

Three things:

1) How would this work in the face of functions that erase typmod information?
2) json [no typmod] would have to effectively default to json(string)?
3) How would #1 and #2 interact?

I pondered the general idea, but my (admittedly limited) gut feeling is that using typmod would possibly be technically untenable, and from an end-user perspective would be even more confusing than having two types.

David J.
Re: [HACKERS] WITH ORDINALITY versus column definition lists
Tom Lane wrote:
> It seems to me that we don't really want this behavior of the coldeflist not including the ordinality column. It's operating as designed, maybe, but it's unexpected and confusing. We could either:
>
> 1. Reinsert HEAD's prohibition against directly combining WITH ORDINALITY with a coldeflist (with a better error message and a HINT suggesting that you can get what you want via the TABLE syntax).
>
> 2. Change the parser so that the coldeflist is considered to include the ordinality column, for consistency with the bare-alias case. We'd therefore insist that the last coldeflist item be declared as int8, and then probably have to strip it out internally.

#2, but I am hoping to be able to make the definition of the column optional. One possibility is that if you do want to provide an alias, you have to make it clear that the coldeflist item in question is only valid as a WITH ORDINALITY column alias. Otherwise the entire coldeflist is used to alias the record-type output and the ordinality column is given its default name. Two options I came up with:

1) disallow any type specifier on the last item: t(f1 int, f2 text, o1)
2) add a new pseudo-type, ord: t(f1 int, f2 text, o1 ord)

I really like option #2. It makes it perfectly clear, entirely within the coldeflist SQL, that the last column is different and in this case optional - both in the sense of providing an alias, and also in that the user can drop the whole ordinality aspect of the call. The system does not need to be told, by the user, the actual type of the ordinality column. And given that I would suppose most people would think to use int or bigint before using int8, usability is improved: once they need to alias the ordinality column, they learn to use the ord type, which would internally resolve to the necessary output type.
Option one is somewhat simpler, but the slight added verbosity of option two makes reading the SQL coldeflist easier, IMO: you are already scanning name-type pairs, and recognizing the missing type is, for me, harder than reading off "ord" and recalling its meaning.

David J.
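For context, the bare-alias case that both options leave untouched looks like this (valid in the syntax as committed; here the ordinality column is simply the last aliased column):

```sql
-- n is the ordinality column, typed bigint by the system
SELECT * FROM unnest(ARRAY['one','two']) WITH ORDINALITY AS t(val, n);
--  val | n
-- -----+---
--  one | 1
--  two | 2
```

The whole debate above concerns only the coldeflist form, where the function returns record and the column types must be spelled out.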
Re: [HACKERS] WITH ORDINALITY versus column definition lists
Tom Lane-2 wrote:
> David Johnston <polobo@...> writes:
>> Tom Lane-2 wrote:
>>> It seems to me that we don't really want this behavior of the coldeflist not including the ordinality column. It's operating as designed, maybe, but it's unexpected and confusing. We could either
>>> 1. Reinsert HEAD's prohibition against directly combining WITH ORDINALITY with a coldeflist (with a better error message and a HINT suggesting that you can get what you want via the TABLE syntax).
>>> 2. Change the parser so that the coldeflist is considered to include the ordinality column, for consistency with the bare-alias case. We'd therefore insist that the last coldeflist item be declared as int8, and then probably have to strip it out internally.
>> Two options I came up with:
>> 1) disallow any type specifier on the last item: t(f1 int, f2 text, o1)
>> 2) add a new pseudo-type, ord: t(f1 int, f2 text, o1 ord)
>> I really like option #2.
> I don't. Pseudo-types have a whole lot of baggage. #1 is a mess too. And in either case, making coldef list items optional increases the number of ways to make a mistake, if you accidentally omit some other column for instance.

I'll have to trust the baggage/mess conclusion, but if you can distinctly and unambiguously identify the coldeflist item that is to be used for ordinality-column aliasing, then the mistakes related to the function-record coldeflist are the same as now. There may be more ways for the user to make a mistake (but still quite few, I would think), but the syntax ones are handled anyway, and so if the others can be handled reasonably well the UI for the feature becomes more friendly. IOW, instead of adding int8 and ignoring it, we poll the last item, conditionally discard it (like the int8 case), then handle the possibly modified structure as planned.

> Basically the problem here is that it's not immediately obvious whether the coldef list ought to include the ordinality column or not.
The user would probably guess not (since the system knows what type ordinality should be). Yes, if the column is not made optional somehow, then I dislike option #2.

> The TABLE syntax is really a vastly better solution for this. So I'm thinking my #1 is the best answer, assuming we can come up with a good error message. My first attempt would be
> ERROR: WITH ORDINALITY cannot be used with a column definition list
> HINT: Put the function's column definition list inside TABLE() syntax.
> Better ideas?

Works for me if #1 is implemented. Just to clarify, we are still allowing simple aliasing: select * from generate_series(1,2) with ordinality as t(f1,f2); It's only when the output of the function is record that the restriction of placing the record-returning function call into TABLE (if you want ordinals) comes into play: select * from table(array_to_set(array['one', 'two']) as (f1 int,f2 text)) with ordinality as t(a1,a2,a3);

If we could do away with having to re-specify the record aliases in the outer layer (a1, a2) then I'd be more understanding, but I'm thinking that is not possible unless you force a single-column alias definition attached to WITH ORDINALITY to mean "alias the ordinality column only". On the plus side: anyone using record-returning functions is already dealing with considerable verbosity, so this extra bit doesn't seem to add that much overhead; and since the alias - t(a1,a2,a3) - is optional, if you don't care about aliasing the ordinality column the default case is not that verbose (just add the surrounding TABLE).

I feel like I need a flow-chart for #1... With #2 (with the optional column) you can add an alias for the ordinality column anyplace you would be specifying a coldeflist OR alias list. Favoring the pseudo-type solution is the fact that, given the prior sentence, if you place o1 ord in the wrong place it is possible to generate an error like "WITH ORDINALITY not present for aliasing". #1 is simpler to implement and does not preclude #2 in the future.
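Setting the two statements above off on their own (array_to_set is the hypothetical record-returning function from this thread, and the TABLE(...) wrapper is the proposed syntax from option #1, not something that runs as-is):

```sql
-- Simple aliasing, still allowed: f2 names the ordinality column
SELECT * FROM generate_series(1,2) WITH ORDINALITY AS t(f1, f2);

-- Record-returning function under #1: the coldeflist moves inside
-- TABLE(), and the outer alias list covers the record columns plus
-- the ordinality column
SELECT * FROM TABLE(array_to_set(ARRAY['one','two']) AS (f1 int, f2 text))
    WITH ORDINALITY AS t(a1, a2, a3);
```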
Possible #3? Not sure if this is possible at this point, but really the alias for the ordinality column would be attached directly to the ORDINALITY keyword, e.g.:

...) with ordinality{alias} as t(a1, a2)

David J.
Re: [HACKERS] WITH ORDINALITY versus column definition lists
Tom Lane-2 wrote:
> David Johnston <polobo@...> writes:
>> Just to clarify, we are still allowing simple aliasing:
>> select * from generate_series(1,2) with ordinality as t(f1,f2);
> Right, that works (and is required by spec, I believe). It's what to do with our column-definition-list extension that's at issue.
>> Not sure if this is possible at this point but really the alias for the ordinality column would be attached directly to the ordinality keyword. e.g., ...) with ordinality{alias} as t(a1, a2)
> This has no support in the standard.

Now I'm just spinning some thoughts:

) with ordinality AS t(a1 text, a2 text | ord1) -- type-less, but a different separator
) with ordinality AS t(a1 text, a2 text)(ord1) -- stick it in its own section, type-less
) with ordinality AS t(a1 text, a2 text) ordinal(ord1) -- name the section too

We would probably want to extend the alias syntax to match... Is there any precedent in other RDBMSs to consider?

I don't see any obvious alternatives to the ones you listed, and syntax is really not a huge barrier. If the implementation of an optionally specified alias is a barrier, then either someone needs to feel strongly enough to implement it or we just default to #1 for the time being. But others really haven't had a chance to read and respond yet, so I'm gonna get off this train for a while.

David J.