Re: [HACKERS] Cluster wide option to control symbol case folding
On Mon, Jan 2, 2017 at 01:25:35PM -0500, Robert Haas wrote: > > But, I can easily imagine a good number of people deciding they want > > mixed case on the server, and so quoting their identifiers. And, then > > deciding PostgreSQL is defective, rather than deciding their favorite > > administration or query tool is defective. Almost all of the tools I > > tried worked fine when I had all lower case symbols on the server. Based > > on observing the generated SQL, most of the tools that failed for me > > when I had mixed case symbols on the server would work against a case > > preserving mode in PostgreSQL. The tools generally pass through the > > catalog reported symbols without manipulation. > > Right. Tom's argument makes a lot of sense when you think about this > from the point of view of someone writing extensions or tools that are > designed to work with anybody's PostgreSQL instance. When you think > about it from the point of view of a user wanting to write an > application that can work with any of a number of databases, then your > argument has a lot of merit to it. I'm not sure there's any way to > split the baby here: tool authors will obviously prefer that > PostgreSQL's behavior in this area be invariable, while people trying > to develop portable database applications will prefer configurability. > As far as I can see, this is a zero sum game that is bound to have one > winner and one loser. Please let me restate the above. For those working only in the Postgres ecosystem, the rules are pretty clear --- quote nothing and use only lowercase, or quote everything. The reason "quote nothing" is insufficient is that tools like pgAdmin will quote mixed-case identifiers during object creation, so technically it is difficult to have pure "quote nothing" behavior in Postgres unless you control all the tooling. Now, clearly, we have users coming from non-Postgres databases where the behavior is different, i.e. Oracle might be "quote nothing and use only uppercase". It seems SQLAnywhere is "quote nothing and case is preserved". The problem with opening Postgres up to user-modifiable case folding is that Postgres ecosystem queries will have to adjust to the fact that case folding is no longer predictable. For database applications the fix might be as easy as changing the session state, but for extensions and libraries, they would need to quote _everything_ to handle uncontrollable case folding behavior. So, in a way, the crazy quoting we are trying to avoid for migrated queries now happens in the Postgres ecosystem, and we might even have more of it. This basically pushes the quoting overhead from users moving to Postgres from other databases to Postgres ecosystem tooling. Whether that is better or worse overall is a judgement call, as Robert stated. -- Bruce Momjianhttp://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
Robert Haas [mailto:robertmh...@gmail.com] wrote: >> Where you get into trouble there is that you might run CREATE EXTENSION >> from that session Yes. I can see this problem. And, while I can imagine resolving it with context belonging to the extension, separate from the current session's context, any resolution gets to be pretty complex. Probably complex enough that the resolution is worse than living with the problem as part of the cost of the feature. Which means, it remains one of the arguments against it. >> Again, I'm not trying to rain down fire and brimstone >> on your idea here and I clearly see the utility of it. I do not feel this at all (though, since this is e-mail, it is helpful that you state it explicitly). And, I have not felt like this was the case at any point in these discussions. I have consistently received thoughtful and remarkably good responses containing solid points. While I did not recognize all the impacts, and I probably still do not, I never thought this was a trivial issue with no arguments against it (not least, that what I was asking for is not standard compliant). Every modal behavior of any sort in any software adds pain. At the very least it increases the regression testing burden. And, this is a mode with pretty fundamental impact. It has to be worth a lot to somebody to be worth having. And, of course, it also has to work. Ian Lewis (www.mstarlabs.com) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On Tue, Jan 3, 2017 at 6:45 PM, Lewis, Ian (Microstar Laboratories)wrote: > One idea, which would likely be harder to implement on the server, but > that would have less impact on third party tools and libraries, would be > to configure case folding on a session basis. There would have to be > some means to configure a session for the case folding your application > wants to see. And, the general default would have to be the current > PostgreSQL behavior so that an application that was designed for current > behavior would never see a change. Where you get into trouble there is that you might run CREATE EXTENSION from that session, or call an SQL function defined in a session with different settings (perhaps a function created by an extension). This is not unlike various problems we've had over the years with search_path, which really ought to be lexically scoped but is in fact dynamically scoped. Again, I'm not trying to rain down fire and brimstone on your idea here and I clearly see the utility of it. I also think it's great that you've engaged in the discussion in the way that you have. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
Tom Lane [mailto:t...@sss.pgh.pa.us] wrote: >> 2. If the folding mode is chosen through a GUC variable, which >> is certainly what people would expect, then it turns out that >> it breaks client libraries/applications *anyway*, because an >> installation-wide setting could impose itself on a client that >> hadn't asked for it. I know that some variables can only be configured at a wide scope, and not a narrow one. Is there no way to restrict a GUC variable's configuration scope to session and finer, but force a fixed value at global scope? If it is possible to restrict global configuration, that at least protects the general purpose administrative tools to a significant degree. >> And for libraries, that isn't a great solution because then they're incompatible with applications that wanted another setting. Good point. Libraries continue to have problems even with session level configuration if they are to operate in the context of an application that reconfigures its session case folding for its own purposes. But, that seems like a problem that is much more likely to affect developers of new systems rather than general users or administrators of existing database systems. If so, it is more of a forward looking problem than a legacy problem in the sense that the person who encounters it is likely to be in a position to do something about it. This makes it much less critical to get every library in the world updated to support all case folding modes than would be the case for general administrative tools like pgAdmin. Depending on the nature of the library, a developer would have the option of using multiple sessions or, perhaps, if it were possible, modifying the folding configuration when using the library. Anyhow, as you say, libraries clearly continue to have issues even with restricted scope on case folding configuration. And the session level idea really helps nothing unless the global default session configuration is fixed. Ian Lewis (www.mstarlabs.com) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
"Lewis, Ian \(Microstar Laboratories\)"writes: > One idea, which would likely be harder to implement on the server, but > that would have less impact on third party tools and libraries, would be > to configure case folding on a session basis. There are a couple of problems even with that: 1. All the sessions have to share the same catalog state, which greatly restricts what you could do. 2. If the folding mode is chosen through a GUC variable, which is certainly what people would expect, then it turns out that it breaks client libraries/applications *anyway*, because an installation-wide setting could impose itself on a client that hadn't asked for it. So you have to update everything at least to the extent of teaching it to turn off setting X when talking to server versions >= Y. And for libraries, that isn't a great solution because then they're incompatible with applications that wanted another setting. The notion that the client side is a monolithic chunk doesn't withstand scrutiny; really there's usually a stack of code over there, and it all has to cope with the SQL semantics we expose. Point #2 was really the lesson that we learned the hard way with the autocommit fiasco. We'd thought going into it that client-side code could be updated only when somebody wanted to use the new behavior, and it took awhile to absorb the fact that much code would be forced to deal with the behavior whether it wanted to or not. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
Robert Haas [mailto:robertmh...@gmail.com] wrote: > The issue is, rather, that every extension written for > PostgreSQL, whether in or out of core, needs to handle this issue and > every general-purpose client tool (pgAdmin, etc.) needs to be aware of > it. I can see the accuracy of all of the points you make here. And, I definitely had not thought through the side effects on support tools and third party libraries of implementing such modal behavior on the server when I originally asked my question. I did not even understand the ramifications of upper case folding on the server until Tom pointed out the earlier conversations on the subject (in my defense, I was not confused enough to think I had thought through all the effects of a fundamental change to language recognition based on writing one e-mail message). A fully case sensitive mode, leaving the server catalogs all in lower case, which is what we would really like to have for our use, still looks pretty easy to implement on the server. And, it would at least behave consistently with the lower case folding mode if one quoted all identifiers, unlike a case preserving, case insensitive mode. One idea, which would likely be harder to implement on the server, but that would have less impact on third party tools and libraries, would be to configure case folding on a session basis. There would have to be some means to configure a session for the case folding your application wants to see. And, the general default would have to be the current PostgreSQL behavior so that an application that was designed for current behavior would never see a change. While not quite obvious to me how one would implement this for all client environments, it would make such a feature more useful if it included a means to make the configuration outside of the scope of an application itself so that one could give an application over which one has no control the behavior it expects. That is, provide a means to configure a specific application's session default behavior on the client. But, provide no means to configure the server's general default behavior so that the server itself is never modal with respect to case folding. Only the client session is modal. It is pretty easy to see the pain of adding symbol case folding modes. On the other hand, there is no way to know exactly the gain (or loss) in adoption to providing alternate case folding. So, you have one fact (the pain) and one speculation (the gain). I can see that makes deciding whether this is a good or bad idea for the project not at all easy. Anyhow, I appreciate the time you, and others, have taken to explain your thinking and the impacts of adding modal case folding to the server. Ian Lewis (www.mstarlabs.com) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On 1/2/17 2:01 PM, Greg Stark wrote: The PostGIS extensions might not work on your system with different case rules if they haven't been 100% consistent with their camelCasing FWIW I've already run into a similar problem with inter-extension dependencies and relocatability. I've found hacks to work around this during extension installation (ie: query pg_extension.extnamespace in the dependent extension build script), but if the other extension gets relocated you're hosed. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
Lewis, Ian (Microstar Laboratories) wrote: > PS. To anyone who might know the answer: My Reply All to this group does > not seem to join to the original thread. All I am doing is Reply All > from Outlook. Is there something else I need to do to allow my responses > to join the original thread? That's a known deficiency with Outlook, which fails to include the required headers (References and/or In-Reply-To). We have a few threads like that in the archives. As far as I know there's no workaround; the only solution is to change to another mail program. Perhaps there are config options that can be changed, but I've asked a few people that have had this problem and nobody has found anything -- but I don't know how hard they've tried, if at all. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On Mon, Jan 2, 2017 at 8:03 PM, Lewis, Ian (Microstar Laboratories)wrote: > Personally, I believe such an option would increase, not decrease the > number of people who could relatively easily use PostgreSQL. If that is > right it is a strong argument for such a modal behavior in spite of the > obvious real pain. That is definitely true. "X will let more people easily use PostgreSQL" is an argument with a lot of merit, and I don't see how anyone could argue otherwise (unless they want fewer people to use PostgreSQL). I think the issue isn't so much whether we'd increase or decrease the number of people who could easily use PostgreSQL; I'm confident that, as you say, many potential users would find it convenient. The issue is, rather, that every extension written for PostgreSQL, whether in or out of core, needs to handle this issue and every general-purpose client tool (pgAdmin, etc.) needs to be aware of it. Many drivers probably need to know about it. Any application built on PostgreSQL must either now require a specific server configuration or be prepared to work with any of them. I think this is the sort of thing where the server changes would take a month and tools, connectors, and extensions would still be getting bug reports a decade later. Now, I am not as convinced as Tom is that this is a categorically terrible idea. But I do think that it would be easy to underestimate the amount of collateral damage that such a change would cause. It would be very large. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
From: Robert Haas [mailto:robertmh...@gmail.com] wrote: > I'm not sure there's any way to split the baby here: tool authors will obviously prefer that PostgreSQL's behavior in this area be invariable, while people trying to develop portable database applications will prefer configurability. > As far as I can see, this is a zero sum game that is bound to have one winner and one loser. Tom is clearly right that such modes make life harder in a fundamental way for anyone writing only against PostgreSQL. And, excepting the upper case folding option, which is of no interest at all to me personally - I do not care which case folding messes up my symbol declarations, it would move PostgreSQL away from the standard rather than closer to it (against that, however, PostgreSQL has many features that are not part of the standard, including its existing lower case folding). If he is also right that addition of such an option would deteriorate into a situation where more people think PostgreSQL is broken, rather than fewer people thinking that, as I think would be the case, I have no strong argument for why PostgreSQL - as a project - should support such modal behavior. Personally, I believe such an option would increase, not decrease the number of people who could relatively easily use PostgreSQL. If that is right it is a strong argument for such a modal behavior in spite of the obvious real pain. And, from what I can see, many, maybe most, general purpose tool authors target many backends. So, they already have to deal with some signficiant degree of variation in case folding behavior. So, I do not really see this as a zero sum game. It is a question of whether such an option would grow the user base. If not, it is clearly a bad idea for the project. But, if it would grow the user base sufficiently, then, yes, there is pain for those who write general purpose tools aimed only at PostgreSQL. But, such tools gain from a wider adoption of PostgreSQL. Ian Lewis (www.mstarlabs.com) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark wrote: > But the problem with configurable quoting rules is a bit different. > Imagine your application later decides to depend on PostGIS. So you load the PostGIS extension and perhaps also some useful functions you found on Stack Overflow for solving some GIS problem you have. Those extensions will create objects and then work with those objects and may use CamelCase for clarity -- in > fact I think PostGIS functions are documented as CamelCase. The PostGIS extensions might not work on your system with different case rules if they haven't been 100% consistent with their camelCasing, and the functions from StackOverflow would be even less likely to work. Well, in the case of StackOverflow suggestions, I cannot remember a time when I did not have to rewrite whatever suggestions I have found and used. That is not to say that StackOverflow is not useful. It is incredibly useful at times. But, the suggestions are usually fragments showing how to do something, not solutions. And, most such suggestions are small. And, so, relatively easy to understand and patch as needed. Many such suggestions are not very useful verbatim anyhow. They are useful exactly because they allow you to understand something that you were unable to glean from the documentation. Certainly, making symbol usage consistent is not a hard patch on a small fragment of code that probably needs help anyhow to bring it to production grade. I would not consider this a strong argument against having modal symbol recognition. Your point about PostGIS, and other full or partial solutions for a complex problem, is a more serious issue. I do not have a strong answer to this point. However, at the least a CamelCase case defect in a tool is a pretty easy problem to locate and submit as a patch. (I understand that your point is not just about PostGIS, but for PostGIS itself I have read in a few places that they quote everything already. I do not know whether that is true or not as I have never even looked at the tool. However, if it is true they quote everything, then they already have their CamelCase exactly right everywhere. If they did not the symbol lookup would fail against current PostgreSQL. Any tool that quotes everything should work the same way against any mode as long as all modes are case sensitive. It might be ugly, but at least it should always work no matter what the back end case translation.) In our own code, I actually would prefer that we were forced to always use the same case everywhere we refer to a symbol. And a case sensitive behavior would enforce that at testing. I do not want this because I want to be able to define symbols that differ only in case. I want it so that every symbol reference is exactly visually like every other symbol reference to the same object. Even though the effect is small, I think such consistency makes it easier to read code. Even in C we almost never use the ability to overload on case alone except in a few rare - and localized - cases where the code is actually clearer with such a notation. For example, in a mathematical implementation, using a notation where something like t acts as an index and T defines the range of t the difference in case is very clear. Perhaps more importantly, this use of overload on case is consistent with conventional mathematical notation (which, in my opinion is very good where it belongs). This is not true when dealing with TheLongSymbolWithMixedCase vs. TheLongSymbolWithMixedcase. The human brain cannot see that difference easily, while it can see the difference between t and T very easily, and it can see the relationship between the two symbols more easily than it can see the relationship between t and tmax, say. Still, we almost never have such code running on a database server. Anyhow, you have a good point about third party libraries and tools that integrate with PostgreSQL. However, I for one would be willing to live with and address that kind of issue as needed. If the behavior were controlled at database create time, which, from the articles Tom linked, seems to be the general consensus as the right time for such a choice given the current implementation, then one would at least have the option of having databases with different case rules within a cluster. Since each session can only connect to one database, this is not a solution to every such situation, but it would address at least some such cases. Ian Lewis (www.mstarlabs.com) PS. To anyone who might know the answer: My Reply All to this group does not seem to join to the original thread. All I am doing is Reply All from Outlook. Is there something else I need to do to allow my responses to join the original thread? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On 25 December 2016 at 09:40, Lewis, Ian (Microstar Laboratories)wrote: > So, the current behavior already breaks many tools unless one accepts > that all symbols on the server are lower case. At root, based on reading > the threads you provided, this probably indicates defects in the tools, > rather than a problem with PostgreSQL. My reading of the standard text > quoted in various places is that any mixed case identifier returned from > the catalog has to be quoted to match in a query (whether you fold to > lower or upper case). Well tools that work with user-defined columns and make assumptions that they don't require quoting are just buggy. But the problem with configurable quoting rules is a bit different. Imagine your application later decides to depend on PostGIS. So you load the PostGIS extension and perhaps also some useful functions you found on Stack Overflow for solving some GIS problem you have. Those extensions will create objects and then work with those objects and may use CamelCase for clarity -- in fact I think PostGIS functions are documented as CamelCase. The PostGIS extensions might not work on your system with different case rules if they haven't been 100% consistent with their camelCasing, and the functions from StackOverflow would be even less likely to work. If there was some way to scope this setting lexically so it only affected code that's defined in specific place that might be safer. But I don't think things are currently organized that way. If you're only concerned with server-side functions it wouldn't be hard to have a specific pl language that was case sensitive though it might be tricky to do to pl/pgsql due to the way pl/pgsql depends on the sql parser. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On 1/2/17 12:25 PM, Robert Haas wrote: But, I can easily imagine a good number of people deciding they want mixed case on the server, and so quoting their identifiers. And, then deciding PostgreSQL is defective, rather than deciding their favorite administration or query tool is defective. Almost all of the tools I tried worked fine when I had all lower case symbols on the server. Based on observing the generated SQL, most of the tools that failed for me when I had mixed case symbols on the server would work against a case preserving mode in PostgreSQL. The tools generally pass through the catalog reported symbols without manipulation. Right. Tom's argument makes a lot of sense when you think about this from the point of view of someone writing extensions or tools that are designed to work with anybody's PostgreSQL instance. When you think about it from the point of view of a user wanting to write an application that can work with any of a number of databases, then your argument has a lot of merit to it. I'm not sure there's any way to split the baby here: tool authors will obviously prefer that PostgreSQL's behavior in this area be invariable, while people trying to develop portable database applications will prefer configurability. As far as I can see, this is a zero sum game that is bound to have one winner and one loser. I do wonder what the authors of those tools are using to test them. My guess is they're just hitting the default postgres database, which has no quotable identifiers. I'd find it useful to have a contrib module/tool that would create a full-blown test database that contains objects of every possible type, including using identifiers that need quotes. It'd also be useful to test leading and trailing whitespace. Having this would be useful for seeing what some of the more unusual objects look like in the catalog, and would simplify tests that I create for my extensions somewhat. If tool authors knew it existed they could use it for testing. It'd certainly be useful for testing things like pg_dump and event triggers. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On Sun, Dec 25, 2016 at 4:40 AM, Lewis, Ian (Microstar Laboratories)wrote: > I assume you are talking about general purpose tools that attempt to > interact with any database in any configuration. Obviously, a purpose > built tool, such as our own internal database applications, would be > designed only for the behavior of the databases it is intended to work > against. Right. > So, the current behavior already breaks many tools unless one accepts > that all symbols on the server are lower case. At root, based on reading > the threads you provided, this probably indicates defects in the tools, > rather than a problem with PostgreSQL. My reading of the standard text > quoted in various places is that any mixed case identifier returned from > the catalog has to be quoted to match in a query (whether you fold to > lower or upper case). > > But, I can easily imagine a good number of people deciding they want > mixed case on the server, and so quoting their identifiers. And, then > deciding PostgreSQL is defective, rather than deciding their favorite > administration or query tool is defective. Almost all of the tools I > tried worked fine when I had all lower case symbols on the server. Based > on observing the generated SQL, most of the tools that failed for me > when I had mixed case symbols on the server would work against a case > preserving mode in PostgreSQL. The tools generally pass through the > catalog reported symbols without manipulation. Right. Tom's argument makes a lot of sense when you think about this from the point of view of someone writing extensions or tools that are designed to work with anybody's PostgreSQL instance. When you think about it from the point of view of a user wanting to write an application that can work with any of a number of databases, then your argument has a lot of merit to it. I'm not sure there's any way to split the baby here: tool authors will obviously prefer that PostgreSQL's behavior in this area be invariable, while people trying to develop portable database applications will prefer configurability. As far as I can see, this is a zero sum game that is bound to have one winner and one loser. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On 25 Dec. 2016 14:49, "Tom Lane"wrote: No. This has been looked into repeatedly in the past, and we simply don't want to deal with it. Quite aside from the impact on the server (which would be extensive), it would break every nontrivial application, and force them all to try to deal with each possible folding behavior. The more behaviors there are, the worse that gets. Good point. That's been a source of pain for standard_conforming_strings and bytea_output. Many apps are also - sadly - hopelessly broken with regards to text encoding. (Including, unfortunately, PostgreSQL it's self, but more subtly than the obvious brain-dead behaviour of many apps). Tom makes a good point. Apps that need to care are better off consistently double quoting. Frankly if we were going to add any option at all I'd want one to force all unquoted identifiers to ERROR so apps could be easily validated not to depend on case folding behaviour at all. But even that has issues and is probably better done in tooling and static analysis or via a plugin.
Re: [HACKERS] Cluster wide option to control symbol case folding
Tom Lane [mailto:t...@sss.pgh.pa.us] wrote: > Quite aside from the impact on the server (which would be extensive), it would break every nontrivial application, and force them all to try to deal with each possible folding behavior. I have read through the various threads related to this issue that you supplied. And, it looks quite clear that a change is unlikely. Maybe it is even a bad idea, though personally I think it could prove worth the pain that you obviously anticipate. However, before fully dropping this issue, I do have one comment I would like to make (just for the record) on your statement that a change in the current behavior would break most non-trivial applications. I assume you are talking about general purpose tools that attempt to interact with any database in any configuration. Obviously, a purpose built tool, such as our own internal database applications, would be designed only for the behavior of the databases it is intended to work against. I have, over the past few months, tried quite a large number (10, maybe a few more than that) of general purpose tools against PostgreSQL looking for replacements for some of the tools we use against our old database server. And, almost none of them work well if I quote identifiers on the server. Almost all work perfectly well if I accept that all my symbols will be converted to lower case and do not quote the identifiers. Most fail - often not even in corner cases - when the catalog (properly) returns a mixed case symbol like WeeklySales. Very few of the tools I have tried seem to know to quote such a symbol to preserve the case when generating a query to send back to the server. Most of the tools I have tested work through ODBC, though a few connect directly to PostgreSQL. Both types of tools have exhibited similar issues. Maybe the ODBC connection is relevant, since it appears that Microsoft SQL Server does what our current server does and preserves case, at least in some modes. A good fraction of the tools I have tried, push their compatibility with SQL Server. So, the current behavior already breaks many tools unless one accepts that all symbols on the server are lower case. At root, based on reading the threads you provided, this probably indicates defects in the tools, rather than a problem with PostgreSQL. My reading of the standard text quoted in various places is that any mixed case identifier returned from the catalog has to be quoted to match in a query (whether you fold to lower or upper case). But, I can easily imagine a good number of people deciding they want mixed case on the server, and so quoting their identifiers. And, then deciding PostgreSQL is defective, rather than deciding their favorite administration or query tool is defective. Almost all of the tools I tried worked fine when I had all lower case symbols on the server. Based on observing the generated SQL, most of the tools that failed for me when I had mixed case symbols on the server would work against a case preserving mode in PostgreSQL. The tools generally pass through the catalog reported symbols without manipulation. I fully understand your concern to keep the number of modal behaviors to a minimum. And, the upper case folding looks to have a lot of side effects, and so is a whole lot harder to implement than I expected. Preserving case, which is what we actually want, by contrast still looks pretty easy, though it is clearly not standard compliant. I do not have any more to add on the subject. I just wanted to make this note that a good fraction of third party tools already fail unless one never quotes identifiers on PostgreSQL. Thank you for taking the time to reply to my original inquiry. Ian Lewis www.mstarlabs.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On Saturday, December 24, 2016 10:49 PM Tom Lane [mailto:t...@sss.pgh.pa.us] wrote: > No. This has been looked into repeatedly in the past, and we simply don't want to deal with it. Fair enough. We will not pursue the issue then. That is why I asked. Ian Lewis (www.mstarlabs.com) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On December 24, 2016 9:52 PM Craig Ringer [mailto:craig.rin...@2ndquadrant.com] wrote: > Personally I can see such an option being ok as an initdb-time setting or at CREATE DATABASE time. Case folding can know the current db from global context. > > It'd have to be fast though. Very fast. That seems reasonable. In fact, it is nicer to configure at the database level, rather than at the cluster level. I did not know it was possible to tell the database from global context. Since a connection only allows access to a single database, it makes sense that could be possible. A check of a single global configuration variable to switch between one of three handler paths would be very small compared with the character-by-character checks currently performed by downcase_identifier(), though, of course, the extra check would not be free. However, it would likely be faster to setup up the processing to call through a global pointer to one of three handler functions. On most Intel processors, at least, that extra pointer indirection costs little to nothing. The pointer could be set up during database connect (I do not know what I am talking about here, but there must be such a process somewhere). Presumably, the handler pointer would have to go into the global database descriptor whatever that is. Or, if you allow use of global objects for storing information about database scope run-time configuration, it could just be a function pointer stored with the handlers. The database initialization processing could call a setup function when it runs to select the correct handling for its configuration. The default would be the current downcase_identifier() handling. Does this seem like an approach that would meet your "Very fast" requirement? Ian Lewis (www.mstarlabs.com)
Re: [HACKERS] Cluster wide option to control symbol case folding
"Lewis, Ian \(Microstar Laboratories\)"writes: > Is there any chance that the PostgreSQL developers would accept a new > cluster wide configuration option to control how the system handles > symbol case folding? No. This has been looked into repeatedly in the past, and we simply don't want to deal with it. Quite aside from the impact on the server (which would be extensive), it would break every nontrivial application, and force them all to try to deal with each possible folding behavior. The more behaviors there are, the worse that gets. Pretty soon, every application is double-quoting every identifier out of sheer paranoia. Which is exactly the behavior you say you'd rather avoid --- but not only have you not avoided it, you've forced the entire Postgres ecosystem into it. The closest precedent that ever actually got into the server was the transaction-autocommit option that existed for awhile circa PG 7.3. Once we realized just how much complexity we were forcing on every application, we took that out again. If somehow we forgot that episode and agreed to take a case-folding behavioral change, I'm pretty sure the same dynamics would play out again. There's a lot of material on this in the archives. The latest substantive discussion I can find, which includes links to several previous investigations, is here: https://www.postgresql.org/message-id/flat/200807081925.40467.peter_e%40gmx.net regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster wide option to control symbol case folding
On 25 Dec. 2016 10:30 am, "Lewis, Ian (Microstar Laboratories)" < ile...@mstarlabs.com> wrote: Is there any chance that the PostgreSQL developers would accept a new cluster wide configuration option to control how the system handles symbol case folding? Probably not as a GUC (configuration option, like in PostgreSQL.conf). There's a somewhat established principle that GUCs should not change query semantics. Personally I can see such an option being ok as an initdb-time setting or at CREATE DATABASE time. Case folding can know the current db from global context. It'd have to be fast though. Very fast.
[HACKERS] Cluster wide option to control symbol case folding
Is there any chance that the PostgreSQL developers would accept a new cluster wide configuration option to control how the system handles symbol case folding? Currently PostgreSQL folds all un-quoted symbols to lower case. We would like to add a global configuration option with a name like symbol_casefold with settings to allow folding symbols to lc - lower case (default) uc - upper case none - no case folding USE CASE 1 The option we actually want for our own use is "none". Currently we have several large bodies of code that work with an SQL Anywhere backend. This server preserves case and - by a configuration option - performs case insensitive lookup on all symbols. We are moving to PostgreSQL for internal applications. We are also considering using PostgreSQL for the storage backend in a product. We use camel case for symbol names in all of our application and backend code (largely written in C++, Object Pascal, and Python). In a language like Pascal that ignores case, we still maintain consistent use of case in symbol names for the sake of human readers of the code. Where we make a connection from application code to a storage backend we would like to use exactly the same symbol name - including case - for the symbol in the backend as we use in the corresponding symbol in application code. For example, if we intend to read a value into a variable called FirstName we would like the corresponding field in the database to also be FirstName. The main reason we want this exact match is so that a human reader sees exactly the same thing in the two places. This makes it easier to see the connection between the two bodies of code. However, in places we also automate the connection between client symbols and server symbols, and in such a case it is useful, though certainly not necessary, to have an exact match. While we do not do so at present, where we use a database purely through code, we can likely add double quotes around all symbols in our generated SQL, which means we can have an exact match as we want under PostgreSQL as it currently behaves. However, in many cases we also access the same database backend through third party tools and ad hoc queries. If we double quote all of our symbols, which is what we have tried to do in our initial tests, then all such tools and manually written queries must also double quote all symbols. We use several tools and libraries that do not appear to have any way to properly quote symbols when they are obtained automatically from the schema. So, for example, if our reporting tool retrieves a field called FirstName from the database schema, it happily uses that name in its internally generated SQL. And, on PostgreSQL, of course this fails because FirstName in script is firstname at the server, which is not defined if we have double quoted the field name in the table definition. Since we do not have the code, we cannot work around this in any easy way. For ad hoc queries we can double quote all symbol references. But, this makes the queries noticeably harder for a person to read. And, I do not believe this is just a matter of "getting used to it". The quotes clutter the script, and that clutter makes the script fundamentally harder to understand. Maybe this is small, but anything that makes comprehension harder is a bad thing. Queries can be hard enough to understand without extra syntactic clutter. So, these considerations leave us the option of never quoting symbols when using PostgreSQL. This works everywhere we have tried it. But, it is pretty unattractive from the point of view of looking at the symbol names in the backend if they are to match exactly the symbol names we use in code. Using a different naming convention than we use everywhere else in our code (underscore separated all lower case symbol names, say) is not appealing either. We are very consistent in our symbol name handling in our code, and breaking our conventions in some relatively large section of our code is very unattractive. In addition, in many places our tools and code use schema supplied field names to form column titles in a table or in a caption on an edit box (etc.). The mixed case names are much nicer for this purpose than the folded names. For our purposes, these are our arguments for wanting control of how the server folds case. USE CASE 2 Even though we have no use for it, I have included the option "uc" because, in trying to determine whether PostgreSQL could support our desired behavior, I found a fairly large number of people who are coming from a different backend, such as Oracle (from what I see on the internet - no personal experience), that case fold similarly to PostgreSQL. However, instead of folding to lower case, it appears a number of other database servers fold to upper case. This leaves people who are moving from these other database systems with problems in their own code that they have to patch up to be able to make the port