Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
2011/10/31 Christopher Browne : > There is legitimate reason to reject this on the basis of nondeterminism. > > While we are surely obliged to "hold our noses" and support "SELECT *", as: > A) The SQL standard obliges us, and > B) People already use it a lot, > > Neither of those factors hold true for the EXCLUDING notion. So all things > are decidedly not equal. > > By all means I find it an interesting feature, but that shouldn't be > mistaken for necessarily being a desirable feature. > > I don't think I wish it. We're telling our developers not to use "select > *", and I don't think having "select * except " would change that policy, > beyond requiring us to waste time explaining : > +1 It can carry some new problems with cache - actually we don't need rebuild views after additing column to table or view Pavel > "No, we're not changing policy. The fact that PGDG added this to 9.2 does > *not* imply our policy was wrong." -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] myProcLocks initialization
On Sun, Oct 30, 2011 at 11:13 PM, Tom Lane wrote: > Robert Haas writes: >> I'd like to propose the attached patch, which initializes each >> PGPROC's myProcLocks just once at postmaster startup, rather than >> every time the PGPROC is handed out to a backend. These lists should >> always be emptied before a backend shuts down, so a newly initialized >> backend will find the lists empty anyway. Not reinitializing them >> shaves a few cycles. In my testing, it saves about 1% of the cost of >> setting up and tearing down a connection, which is not a ton, but a >> cycle saved is a cycle earned. > > That's not really enough to excite me, and the prospect of problems in > one session corrupting an unrelated later one is pretty scary from a > debugging standpoint. How about at least an Assert that the lock is in > a clean state? I can go for that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] myProcLocks initialization
Robert Haas writes: > I'd like to propose the attached patch, which initializes each > PGPROC's myProcLocks just once at postmaster startup, rather than > every time the PGPROC is handed out to a backend. These lists should > always be emptied before a backend shuts down, so a newly initialized > backend will find the lists empty anyway. Not reinitializing them > shaves a few cycles. In my testing, it saves about 1% of the cost of > setting up and tearing down a connection, which is not a ton, but a > cycle saved is a cycle earned. That's not really enough to excite me, and the prospect of problems in one session corrupting an unrelated later one is pretty scary from a debugging standpoint. How about at least an Assert that the lock is in a clean state? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
There is legitimate reason to reject this on the basis of nondeterminism. While we are surely obliged to "hold our noses" and support "SELECT *", as: A) The SQL standard obliges us, and B) People already use it a lot, Neither of those factors hold true for the EXCLUDING notion. So all things are decidedly not equal. By all means I find it an interesting feature, but that shouldn't be mistaken for necessarily being a desirable feature. I don't think I wish it. We're telling our developers not to use "select *", and I don't think having "select * except " would change that policy, beyond requiring us to waste time explaining : "No, we're not changing policy. The fact that PGDG added this to 9.2 does *not* imply our policy was wrong."
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
> The exact choice of keyword matters > a lot less than whether this can be done with out shift/reduce or > reduce/reduce conflicts. Which is the problem right now. See my other email. I'll post a diff tomorrow. Maybe if enough folks think is a feature worth having we can find a solution. My gram.y-fu is at its limits right now. eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 6:11 PM, Darren Duncan wrote: > Eric Ridge wrote: >> >> I don't actually like the term "EXCLUDING", but it conveys what's >> happening and is already defined as a keyword. I thought about >> "EXCEPT", but that doesn't work for obvious reasons, and "NOT" might >> just be confusing. > > How about "BUT"? > > Is that already in use by something? Its nice and short and conveys the > "except" meaning. > > And there is already precedent for using that word for this purpose. > > CJ Date already uses "ALL BUT" in his literature as a modifier to his > illustrative relation projection syntax to give the complementary > projection, like with "r{x,y}" vs "r{all but x,y}". > > Also, a more tenuous connection, Larry Wall likes "but" as logical modifier. Look, there's no good speculating about what might work without sitting down and editing gram.y. The exact choice of keyword matters a lot less than whether this can be done with out shift/reduce or reduce/reduce conflicts. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Mark Mielke wrote: On 10/30/2011 03:50 PM, Eric Ridge wrote: Changes of omission can break your code just as easily. I think I wasn't as clear as I intended. In many ways, I think use of "*" in the first place is wrong for code (despite that I do it as well). Therefore, "* EXCLUDING (...)" would also be wrong. It comes to "does the code know what it wants?" "select *" is not deterministic from a programming perspective. I understand what you're saying. However, we're stuck with * because it is in the standard and is widely used, and if we have * anyway, then the exclusion proposal is just an enhancement to that. So there is no reason to reject the complementary columns feature because of the problems with "select *"; you might as well argue to get rid of "select *". -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 7:46 PM, Mark Mielke wrote: > In the above case - maybe you don't want password - what about social > insurance number, credit card number, or any other private bit? The only way > to truly know you aren't accidentally pulling in fields you don't need or > want to unnecessarily expose on the wire - is to specifically list the > fields you DO want, which is precisely to not use "*" at all. A fairly useful characteristic case would be related to this, namely... select * excluding password from some_table; The sensitive bits of data are often fairly well enumerated, so it's a useful-ish idea to pull everything except that. The joys and challenges may be pretty well characterized by: create view sanitized_table as select * excluding password from some_table; Now, if someone running queries somewhere downstream is sufficiently lacking in clue that they don't know which columns they *do* want, then it seems dubious to me that letting them exclude the columns that they imagine that they know they *don't* want is actually valid. That is, if they don't know what they want (e.g. - the column set in a traditional sans-EXCLUDES query), then I'm rather suspicious that they aren't competent to build a proper EXCLUDES clause. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On 10/30/2011 03:50 PM, Eric Ridge wrote: On Sun, Oct 30, 2011 at 3:38 PM, Mark Mielke wrote: 2) Not deterministic (i.e. a database change might cause my code to break), Okay, I'm inventing a use-case here, but say you have a "users" table with various bits of metadata about the user, including password. Maybe, regardless of database changes, you never want the password column returned: SELECT * EXCLUDING (password) FROM tbl_users; Changes of omission can break your code just as easily. I think I wasn't as clear as I intended. In many ways, I think use of "*" in the first place is wrong for code (despite that I do it as well). Therefore, "* EXCLUDING (...)" would also be wrong. It comes to "does the code know what it wants?" In the above case - maybe you don't want password - what about social insurance number, credit card number, or any other private bit? The only way to truly know you aren't accidentally pulling in fields you don't need or want to unnecessarily expose on the wire - is to specifically list the fields you DO want, which is precisely to not use "*" at all. A particular area that I don't like "*" is that my code may make an assumption about the exact field names, or the field order that comes out. If this is explicitly specified, then it will survive "ALTER TABLE", or a restore of the table with columns in a different order, or a replacement of the table with a view. However, if I use "*", then my code is likely to fail in any of these cases, and possibly fail in some unexpected way. For example, in a language such as Perl with DBI returning a hash, I may accidentally assume that the field is always undef. It might even pass some designer testing if the value is usually NULL = undef, and I fail to simulate the case where it is not. "select *" is not deterministic from a programming perspective. -- Mark Mielke -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] myProcLocks initialization
I'd like to propose the attached patch, which initializes each PGPROC's myProcLocks just once at postmaster startup, rather than every time the PGPROC is handed out to a backend. These lists should always be emptied before a backend shuts down, so a newly initialized backend will find the lists empty anyway. Not reinitializing them shaves a few cycles. In my testing, it saves about 1% of the cost of setting up and tearing down a connection, which is not a ton, but a cycle saved is a cycle earned. Of course, we have a few outstanding reports, like this one from Dave Gould, indicating that maybe we have a bug in there somewhere: http://archives.postgresql.org/message-id/20110822073131.gc3...@sonic.net ...but in that case it seems to me that this doesn't make anything worse than it already is. If the myProcLocks pointers are pointing to random garbage, we're just kidding ourselves whatever we do; the system is screwed, and we ought to PANIC, and anything we do here is laughably inadequate. OTOH, if it just so happens that a backend found a sneaky way through the exit path that doesn't involve calling LockReleaseAll(), then overwriting the shm queue pointers removes our last hope that the next backend can clean up the mess. I'm not putting a lot of faith in that actually working, just saying that the current code doesn't seem to be accomplishing anything in the robustness department. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company init-myproclocks-once.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 3:17 PM, Tom Lane wrote: > "fails to not break anything else" category. >From what I've seen watching this list, you're usually right. :) It looks like it's perfectly okay to write: SELECT pg_class.* AS foo FROM pg_class; (with or without the AS) I don't know what the above actually means, but it stops SELECT pg_class.* EXCLUDING(...) dead in its tracks. So, I'd have to break a syntax (albeit silly?) that currently works to support this. :( eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Eric Ridge wrote: I don't actually like the term "EXCLUDING", but it conveys what's happening and is already defined as a keyword. I thought about "EXCEPT", but that doesn't work for obvious reasons, and "NOT" might just be confusing. How about "BUT"? Is that already in use by something? Its nice and short and conveys the "except" meaning. And there is already precedent for using that word for this purpose. CJ Date already uses "ALL BUT" in his literature as a modifier to his illustrative relation projection syntax to give the complementary projection, like with "r{x,y}" vs "r{all but x,y}". Also, a more tenuous connection, Larry Wall likes "but" as logical modifier. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 4:43 PM, Darren Duncan wrote: > SELECT foo.* EXCLUDING foo.x, bar.* EXCLUDING bar.y, baz.z, (a+b) AS c FROM > ... > Is that where you're going with this? Yes. It's basically a modifier to the star that immediately precedes it. In order to support excluding multiple columns, it needs parens: SELECT foo.* EXCLUDING (foo.x, foo.y), bar.* EXCLUDING (bar.y), baz.z, (a+b) AS c but yes, that's what I'm thinking. I think doing this will require more changes to the grammar than I had first thought because there'd be no point in supporting: SELECT foo.* EXCLUDING (foo.* EXCLUDING foo.y) FROM ... It looks like the above would be implicitly allowed without a bit of extra work. But, if you've got a complex query consisting of a few joins, it'd be nice to say: SELECT * EXCLUDING (table1.*, table2.x) FROM table1 INNER JOIN table2 ... > If so, I think that would make the feature even more valuable and more > syntactically clean than I had previously thought. I don't actually like the term "EXCLUDING", but it conveys what's happening and is already defined as a keyword. I thought about "EXCEPT", but that doesn't work for obvious reasons, and "NOT" might just be confusing. eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Eric Ridge wrote: On Sun, Oct 30, 2011 at 4:03 PM, Tom Lane wrote: That's just a gut feeling, I've not tried it ... but the proposed syntax sure looks a lot like a call to a function named EXCLUDING. I think what makes it okay is that its new use is only defined to immediately follow an asterisk in the "target_el" production. If you look at gram.y:11578 (from git HEAD), I was thinking this: That's an interesting thought. I had previously envisioned EXCLUDING to appear exactly once in the SELECT list, where it functions as a column analogy to EXCEPT for rows. So you'd say: SELECT EXCLUDING FROM ... But then your comment makes me think that your EXCLUDING actually could have a much tighter scope, and as a result might appear multiple times, like AS can: SELECT foo.* EXCLUDING foo.x, bar.* EXCLUDING bar.y, baz.z, (a+b) AS c FROM ... So if you do this, EXCLUDING has syntactic precedence similar to AS, and while AS is only valid not following a *, EXCLUDING is only valid following a *, and so EXCLUDING directly modifies a single * and not the SELECT list in general. Is that where you're going with this? If so, I think that would make the feature even more valuable and more syntactically clean than I had previously thought. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On 10/30/2011 04:03 PM, Tom Lane wrote: Please correct me if I'm wrong, but if we choose the word carefully (which is why I chose EXCLUDING), I think we're okay? EXCLUDING is already defined as an "ordinary key word". Yeah, it's unreserved so it doesn't break use of the same name for columns or functions. I'm not sure that you can make the syntax work the way you suggest without bumping up its reserved-ness level. That's just a gut feeling, I've not tried it ... but the proposed syntax sure looks a lot like a call to a function named EXCLUDING. Adding this rule doesn't appear to cause any complications: target_el: '*' EXCLUDING '(' ')' I'm not saying we need to do this, although there have been times when I might have liked it, but I suspect it would not cause us any grammar problems at least. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 4:03 PM, Tom Lane wrote: > That's just a gut feeling, I've not tried it ... but the proposed > syntax sure looks a lot like a call to a function named EXCLUDING. I think what makes it okay is that its new use is only defined to immediately follow an asterisk in the "target_el" production. If you look at gram.y:11578 (from git HEAD), I was thinking this: | a_expr { $$ = makeNode(ResTarget); $$->name = NULL; $$->indirection = NIL; $$->val = (Node *)$1; $$->location = @1; } +| '*' EXCLUDING '(' columnref_list ')' +{ +/** make magic happen */ +} | '*' { ColumnRef *n = makeNode(ColumnRef); n->fields = list_make1(makeNode(A_Star)); n->location = @1; $$ = makeNode(ResTarget); $$->name = NULL; $$->indirection = NIL; $$->val = (Node *)n; $$->location = @1; } And it looks like something similar would be necessary in the "indirection_el" production, around line 11478. But that might be overly simplistic (and wrong). eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Eric Ridge writes: > On Sun, Oct 30, 2011 at 3:38 PM, Mark Mielke wrote: >> 2) Not deterministic (i.e. a database change might cause my code to break), > Okay, I'm inventing a use-case here, but say you have a "users" table > with various bits of metadata about the user, including password. > Maybe, regardless of database changes, you never want the password > column returned: SELECT * EXCLUDING (password) FROM tbl_users; Well, here you're not only inventing a use-case, but you're making a lot of contrary-to-fact-and-to-SQL-spec assumptions about when the * notation gets expanded. This thing wouldn't be useful that way in views. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Eric Ridge writes: > On Sun, Oct 30, 2011 at 3:17 PM, Tom Lane wrote: >> In particular, I don't think it's acceptable to introduce a >> new reserved keyword for this --- that would fall under the "fails to >> not break anything else" category. > Please correct me if I'm wrong, but if we choose the word carefully > (which is why I chose EXCLUDING), I think we're okay? EXCLUDING is > already defined as an "ordinary key word". Yeah, it's unreserved so it doesn't break use of the same name for columns or functions. I'm not sure that you can make the syntax work the way you suggest without bumping up its reserved-ness level. That's just a gut feeling, I've not tried it ... but the proposed syntax sure looks a lot like a call to a function named EXCLUDING. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 3:38 PM, Mark Mielke wrote: > Stupid question: > > Is this just a display thing? Well, it's a "display thing" as much as any SELECT statement (especially via psql) is a "display thing". It's more like "I want all 127 columns, except the giant ::xml column, and I'm too lazy to type each column name out by hand". > Or does this have impact for things such as COUNT(*) vs COUNT(1)? If it does, it's broken. > Is it "like a view, but on the fly"? Naw, it's just short-hand for omitting columns from the output target list. As I'm envisioning the feature, it would be SQL-level syntax, so you could bake it into a view, but... > 1) Not standards compliant, Sure, no doubt. I think the "::typename" syntax is not standards compliant either, and I bet I'm not the only Postgres user to use that every day. But I secretly tend to agree, I wouldn't necessarily want to use this in production/application code. > 2) Not deterministic (i.e. a database change might cause my code to break), Okay, I'm inventing a use-case here, but say you have a "users" table with various bits of metadata about the user, including password. Maybe, regardless of database changes, you never want the password column returned: SELECT * EXCLUDING (password) FROM tbl_users; Changes of omission can break your code just as easily. > 3) Working around a problem that maybe shouldn't exist in the first place? > It's > a like buying a rug, so that nobody sees the scratches on the floor. Sometimes, rugs are cheaper than new floors. eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Stupid question: Is this just a display thing? Or does this have impact for things such as COUNT(*) vs COUNT(1)? Is it "like a view, but on the fly"? I'm found myself in the *occasional* (certainly not daily!) situation where such a feature might be useful, but each time I wonder about if there should be a better way, I realize that if I ever saw such a thing in production code it would be the first code I shot down. 1) Not standards compliant, 2) Not deterministic (i.e. a database change might cause my code to break), 3) Working around a problem that maybe shouldn't exist in the first place? It's a like buying a rug, so that nobody sees the scratches on the floor. I can see the contention. :-) If it existed, I would occasionally use it from the command line. I'm thinking three times a year. Certainly not daily. Heck, if it's more characters to type (than "select * ...") the number of times I would bother typing it are quite short. :-) Cheers, -- Mark Mielke -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Tom Lane wrote: Darren Duncan writes: The real question to ask ourselves is, if Eric Ridge is willing to do all the work to implement this feature, and the code quality is up to the community standards and doesn't break anything else, then will the code be accepted? It's entirely possible that it will get bounced on standards-compliance grounds. In particular, I don't think it's acceptable to introduce a new reserved keyword for this --- that would fall under the "fails to not break anything else" category. regards, tom lane Well then we come up with a (SQL-level) syntax for the feature that doesn't introduce new reserved keywords. As I said before, the important thing is to have the feature, and that the exact syntax is the main point to discuss. Postgres already has a number of syntactic features that aren't in the SQL standard and coexist, and so we add one of those. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 3:17 PM, Tom Lane wrote: > > It's entirely possible that it will get bounced on standards-compliance > grounds. And that's a perfectly valid reason to reject it. > In particular, I don't think it's acceptable to introduce a > new reserved keyword for this --- that would fall under the "fails to > not break anything else" category. Please correct me if I'm wrong, but if we choose the word carefully (which is why I chose EXCLUDING), I think we're okay? EXCLUDING is already defined as an "ordinary key word". And it's new use in this situation seems to be completely unambiguous, such that you'd still be able to use "excluding" everywhere you already could. You know more about the grammar than I (or probably most anyone), so I'm wondering why you think it might need to be a "reserved keyword"? Alternatively, would it be okay to use an existing reserved keyword? eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
David Wilson wrote: On Sun, Oct 30, 2011 at 1:10 AM, Darren Duncan wrote: The SQL level is exactly the correct and proper place to do this. Its all about mathematical parity. That is the primary reason to do it. - "SELECT *" gives you a whole set. - "SELECT foo, bar" gives you a subset of that. - "SELECT ALL BUT foo, bar" gives you the complementary subset. That's not actually entirely true given the usual SQL (and mathematical) meaning of "set". This feature relates to the set of attributes returned regarding elements of the returned set, not the set itself- the actual returned set is identical regardless of the column-specifier formulation. Claiming this as an SQL mathematical purity issue is a bit disingenuous, as SQL set manipulation takes place at the member level rather than the attribute level- SQL is otherwise quite explicit about requiring explicit listings of the attributes that the client is interested in regarding a returned set of member rows. SQL rowsets/tables are distinctly sets across two dimensions, both across columns and across rows. You have sets of rows and sets of columns at the same time. And it is useful to slice the rowset/table along either or both dimension, wherein you produce a subset in that dimension. We presently have the terse syntax for specifying both the subset we do want and the subset we want the complement of, for rows, but not for columns. It is true that columns and rows are not the same, but they are both still sets. Arguing against this is like arguing against a subtraction operator, because we can emulate using addition plus negation, or saying subtraction should just be a special filter in a client app. That would be true if this was an argument against "WHERE" or "EXCEPT". Column specification and row specification are very different and cannot be conflated. Well I raised the WHERE/EXCEPT analogy in my initial reply. Not conflating rows and columns is why we have different syntax to work with them. That's not to say this proposal is without merit, merely that your arguments for it are poorly founded and not particularly relevant. I disagree, but regardless, other arguments have been made for the feature that are more based in utility, and I agree with those, how having the feature can save a lot of users a lot of work. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 2:54 PM, Robert Haas wrote: > OTOH, I'm slightly afraid of how much work it would take to implement > this properly. I think first, the A_Star node struct will need to be expanded to include a List of qualified column references to exclude. From there, the "target_el" rule in gram.y will need to be expanded to support a syntax like: '*' EXCLUDING '(' columnref_list ')' { ... } I also think that the "indirection_el" rule will need to be expanded to support something similar. Together, that would let us write both: SELECT * EXCLUDING(table1.col1, table2.col1) FROM ... and SELECT table.* EXCLUDING(col1, col2) FROM ... or even SELECT * EXCLUDING(table1.col1), table2.* EXCLUDING(col1) FROM ... I think changing the "indirection_el" rule might have an impact to OLD/NEW, but I'm not sure. Is it legal to write OLD.*, and if so, would you also want to write OLD.* EXCLUDING (...) in those cases? I think this only applies to RULES or SQL-type trigger functions, but not pl/pgsql? Then it looks like touching various functions in src/backend/nodes/*.c to do the right things with the new exclusion list field in A_Star. I haven't traced through everything yet, but it looks like if the various places in src/backend/nodes/*.c are done correctly, then regurgitating a view definition or whatnot that includes this syntax will be automatic (i.e., no specific support required for pg_dump)? Anyways, at first I thought it would be about 8hrs of work just to get something working. Maybe it's more like 20, but even still, it seems fairly straightforward. eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Darren Duncan writes: > The real question to ask ourselves is, if Eric Ridge is willing to do all the > work to implement this feature, and the code quality is up to the community > standards and doesn't break anything else, then will the code be accepted? It's entirely possible that it will get bounced on standards-compliance grounds. In particular, I don't think it's acceptable to introduce a new reserved keyword for this --- that would fall under the "fails to not break anything else" category. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
I think the loose consensus I've picked up from people in this thread is that the ability to tersely specify a complementary subset of columns is something that is simple enough and wouldn't hurt us to have it but that its utility is limited such that a lot of people wouldn't want to do the work to implement it either. Eric B. Ridge wrote: Anyways, looks like it might be an uphill battle to get the idea accepted (let alone any code I write!), but I ain't givin' up just yet. I think this is the bottom line here. The real question to ask ourselves is, if Eric Ridge is willing to do all the work to implement this feature, and the code quality is up to the community standards and doesn't break anything else, then will the code be accepted? Its one thing to argue whether a new small feature is useful enough to go to the trouble to implement, and its another thing to argue whether that feature is harmful enough to reject a free working implementation (of otherwise conforming code quality) from someone who has already gone to the trouble to implement it. Eric, if you want to implement this, I say more power to you, and I will use it. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Oct 30, 2011, at 11:54 AM, Robert Haas wrote: >> If you know that you want all fields except X, Y and Z, why should you be >> forced to manually cut and paste all the other fields into the view >> definition? That's wasted time and effort that could better be spent >> elsewhere. > > I agree. This isn't something I would recommend that people do in the > queries they embed in their application, but for ad-hoc queries it > comes up all the time. It can also be very useful for generating queries that need to omit rows, such as in PL/pgSQL functions that use EXECUTE, not to mention client code. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 2:15 PM, Jim Nasby wrote: > If you know that you want all fields except X, Y and Z, why should you be > forced to manually cut and paste all the other fields into the view > definition? That's wasted time and effort that could better be spent > elsewhere. I agree. This isn't something I would recommend that people do in the queries they embed in their application, but for ad-hoc queries it comes up all the time. OTOH, I'm slightly afraid of how much work it would take to implement this properly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Sun, Oct 30, 2011 at 10:58 AM, Tom Lane wrote: > You are misreading the old code. The inner loop increments the last > byte, checks for success, and if it hasn't produced a greater string > then it loops around to increment again. Ugh. You're right. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Oct 30, 2011, at 12:53 AM, Joshua D. Drake wrote: > > If it is quite regular I would actually argue two things: > > 1. Use a view > 2. You haven't normalized correctly > > I am not trying to be a pedantic zealot or anything but those would be my > arguments against. You know how general database work goes. For every situation where you can make a view or every situation where you should normalize, there's at least one corresponding situation where you can't. All database systems, Postgres included, give us plenty of ways to do things "wrong", many of which are much worse than this little idea. I guess I'd like for everyone to evaluate the idea on the value it could provide to Postgres and its users, rather than imposing philosophical/religious beliefs about "correct" database schema design. I'm regularly tasked with debugging queries, analyzing, exporting, and otherwise transforming data into something a customer wants. I'd use something like "SELECT * EXCLUDING (…)" on a *daily* basis. I'm sick and tired of stuff like: psql -tA db -c "\d table" | cut -f1 -d\| | grep -v col | tr \\n , just to exclude one column from a list of maybe 100. And if multiple tables are involved in the query, it just gets that much more complicated. I'd rather do: SELECT * EXCLUDING (x.col) FROM ( ) x; Then, once I verify it's all good: COPY ( SELECT * EXCLUDING (x.col) FROM ( ) x ) TO '/tmp/foo.out' WITH CSV; Anyways, looks like it might be an uphill battle to get the idea accepted (let alone any code I write!), but I ain't givin' up just yet. eric PROPRIETARY AND COMPANY CONFIDENTIAL COMMUNICATIONS The information contained in this communication is intended only for the use of the addressee. Any other use is strictly prohibited. Please notify the sender if you have received this message in error. This communication is protected by applicable legal privileges and is company confidential. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Oct 29, 2011, at 11:53 PM, Joshua D. Drake wrote: > On 10/29/2011 03:26 PM, Eric Ridge wrote: >> >> Would y'all accept a patch that extended the "SELECT *" syntax to let >> you list fields to exclude from the A_Star? >> >> Quite regularly I'll be testing queries via psql and want to see all >> the columns from a fairly wide table except maybe a giant text or xml >> column. A syntax like: >> >> SELECT * EXCLUDING (big_col1, big_col2) FROM foo; >> > > If it is quite regular I would actually argue two things: > > 1. Use a view If you know that you want all fields except X, Y and Z, why should you be forced to manually cut and paste all the other fields into the view definition? That's wasted time and effort that could better be spent elsewhere. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ecpg-related build failure with make 3.82
Hi. I tested that make 3.82 - win32 version hasn't any new functionalities for example .ONESHELL. -- pasman -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 1:51 PM, Eric B. Ridge wrote: > eric > > PROPRIETARY AND COMPANY CONFIDENTIAL COMMUNICATIONS my bad. Switched email accounts without realizing. :( eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Robert Haas writes: > On Sat, Oct 29, 2011 at 4:36 PM, Tom Lane wrote: >> Oh! You are right, I was expecting it to try multiple characters at the >> same position before truncating the string. This change seems to have >> lobotomized things rather thoroughly. What is the rationale for that? >> As an example, when dealing with a single-character string, it will fail >> altogether if the next code value sorts out-of-order, so this seems to >> me to be a rather large step backwards. > On this point I believe you are still confused. The old code tried > one character per position, and the new code tries one character per > position. Nothing has been lobotomized in any way. No, on this point you are flat out wrong. Try something like select ... where f1 like 'p%'; in tt_RU locale, wherein 'q' sorts between 'k' and 'l'. The old code correctly found that 'r' works as a string greater than 'p'. The new code fails to find a greater string, because it only tries 'q' and then gives up. This results in a selectivity estimate much poorer than necessary. Since the stated purpose of this patch is to fix some corner cases where the code fails to find a greater string, I fail to see why it's acceptable to introduce some other corner cases that weren't there before. > The difference is > that the old code used a "guess and check" approach to generate the > character, so there was an inner loop that was trying to generate a > character (possibly generating various garbage strings that did not > represent a character along the way) and then, upon success, checked > the sort order of that single string before truncating and retrying. You are misreading the old code. The inner loop increments the last byte, checks for success, and if it hasn't produced a greater string then it loops around to increment again. > The fact that we haven't gotten any complaints before suggests that this > actually works decently well as it stands. Well, that's true of the old algorithm ;-) I had likewise thought of the idea of trying some fixed number of character values at each position, but it's unclear to me why that's better than allowing an encoding-specific behavior. I don't believe that we could get away with trying less than a few dozen values, though. For example, in a situation where case sensitivity is relevant, you might need to increment past all the upper-case letters to get to a suitable lower-case letter. I also think that it's probably useful to try incrementing higher-order bytes of a multibyte character before giving up --- we just can't afford to do an exhaustive search. Thus my proposal to let the low-order bytes max out but not cycle. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add socket dir to pg_config..?
On Sat, Oct 29, 2011 at 08:28:57PM +, Mr. Aaron W. Swenson wrote: > > If /tmp is the only decent place where to put the socket file on Unix > > when security and other concerns are considered, then sure, making > > distro life difficult is a good thing to do. But then let's take it to > > the FHS that debian and ubuntu are implementing, AFAIUI. > In Gentoo, we change the socket directory to /var/run/postgresql via > pg_config_manual.h. However, I'm not too terribly interested in pg_config > outputting the directory location. Frankly, I'm not seeing the difference between the socket directory and the "listen_addresses" option. When connecting you can specify the socket directory to use via the "host" option. It might even be more logical to be able to specify multiple directories. Given we support multiple listen sockets I can't imagine it would require much code. (And yes, just today I ran into the issue of hardcoded paths. If the directory it points to is not world writable then you've limited the users who can run the postgres server. Which is an unnecessary restriction imho). Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
Re: [HACKERS] So, is COUNT(*) fast now?
Sorry, I forgot to include the version used & some information about my setup: PostgreSQL version: Git HEAD as of: Date: Fri Oct 28 21:18:36 2011 -0400 Commit: 51eba98cf4595e90730dedd9305da8aa84b649ee Compiled with defaults, (only change --with-pgport = 5431). I used default settings, shared_buffers size is 24MB. The system is Linux Mint Debian edition (kernel 3.0.0, gcc 4.6.1). The interesting parts about my hardware were in the original post. - Anssi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] So, is COUNT(*) fast now?
Quoting Robert Haas: """ I tried this on my MacBook Pro this morning, using pgbench -i -s 500 to create a database about 7.5GB in size, and then using "SELECT sum(1) FROM pgbench_accounts" as a test query, on a build WITHOUT --enable-cassert. This machine has 4GB of memory, and I set shared_buffers = 400MB. (No, I'm not sure whether that's the optimal setting for shared_buffers for this machine.) """ I did some tests where I tried to compare the effect of having the index ordered tuples not be in the same order they are in the base table. The idea is to test what effect accessing the VM map randomly as opposed to sequential order has on performance. I suspect the above test access the VM in order (the accounts table is effectively clustered on the index used in the test). I might be mistaken here. My test setup is this: drop table if exists test; drop table if exists test2; create unlogged table test /* with (fillfactor = 10) */ as select generate_series(0, 20*1000*1000) as id; create index idx1 on test(id); vacuum test; create unlogged table test2 /* with (fillfactor = 10) */ as (select * from test order by random()); create index idx2 on test2(id); vacuum test2; Table size is around 600MB, index size is around 350MB and VM on-disk size is 16kB with default fillfactor. With fillfactor = 10, the VM size is 104 KB, and table size is around 6GB. The index size is the same. Results for the randomly ordered table: # select count(*) from test2; 14822.045 ms 14826.253 ms 14815.450 ms Results for the effectively clustered table: # select count(*) from test; 11761.890 ms 11767.926 ms 11810.900 ms Now, this test still has the benefit of fitting the VM easily into the L1 cache. Next, I did a ugly hack to get the table size large enough so that the VM will trash the L1 cache while still having somewhat reasonable test setup creation time. My harware is old, 1GB of memory, processor is Genuine Intel(R) CPU L2400 @ 1.66GHz. The L1 data cache size is 32kB on my. The hack is to simply set fillfactor to 10. The VM size is now 104kB, the table size is about 6.3 GB while the index size is still the same as in above test. Results for the randomly ordered table: # select count(*) from test2; 21606.683 ms 21829.063 ms 21637.434 ms Results for the effectively clustered table: # select count(*) from test; 11714.663 ms 11449.264 ms 11658.534 ms Now, the next step would be to trash the L2 cache (20GB table size should do this on Sandy Bridge, where L2 cache is 256KB). I don't have hardware to do that test. It is worth noting that the L2 cache is shared on Sandy Bridge, so it is likely that an index-only scan of a large enough table would slow down other processes, too. Without tests this is only FUD, though. The test would be to scan a 20GB table's index repeatedly in one process, and see how it affects standard in-memory pgbench results for other processes. Compare this with doing the same with a sequential scan process. Lessons learned (or what I learned, at least): - Clustering is important for index only scans. Picking a clustered index over non-clustered index will have a big performance effect. - Large table index-only scans are going to be more expensive compared to sequential scan than what pgbench accounts tests suggests. I assume that the accounts table is effectively clustered on the index used. I haven't verified this. - There is the possibility that index-only scans will trash the caches for other processes, too. Not tested, though. I am sure these results will vary significantly based on hardware used. I am also notorious for screwing up benchmarks, so verifying these results is recommended. You will need around 16GB of disk space for the fillfactor = 10 test. I would recommend you have more than 1GB of memory, otherwise creating the test setup can take some time... - Anssi Kääriäinen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sun, Oct 30, 2011 at 1:10 AM, Darren Duncan wrote: > The SQL level is exactly the correct and proper place to do this. > > Its all about mathematical parity. That is the primary reason to do it. > > - "SELECT *" gives you a whole set. > - "SELECT foo, bar" gives you a subset of that. > - "SELECT ALL BUT foo, bar" gives you the complementary subset. That's not actually entirely true given the usual SQL (and mathematical) meaning of "set". This feature relates to the set of attributes returned regarding elements of the returned set, not the set itself- the actual returned set is identical regardless of the column-specifier formulation. Claiming this as an SQL mathematical purity issue is a bit disingenuous, as SQL set manipulation takes place at the member level rather than the attribute level- SQL is otherwise quite explicit about requiring explicit listings of the attributes that the client is interested in regarding a returned set of member rows. > > Arguing against this is like arguing against a subtraction operator, because > we can emulate using addition plus negation, or saying subtraction should > just be a special filter in a client app. That would be true if this was an argument against "WHERE" or "EXCEPT". Column specification and row specification are very different and cannot be conflated. That's not to say this proposal is without merit, merely that your arguments for it are poorly founded and not particularly relevant. -- - David T. Wilson david.t.wil...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Pavel Stehule wrote: 2011/10/30 Darren Duncan : I agree that this feature would be quite useful and should be included in SQL. The exact syntax is less of an issue, but just the ability to cleanly say "select all columns except for these". I have in fact argued for the same feature in the past. If you want to and can implement this feature then more power to you. I'll look forward to it being in Pg 9.2. I think then the only discussion point should be what (terse) syntax to use for it, not whether the feature should exist at all. Arguing against this feature is like arguing against supporting "where not()" or "except" or "not in". One should be able to do complements not only of rows but of columns too. Basic good language design. My practice speaks so this is not true - I don't know only bad designed projects or very bad designed projects that needs too. I don't see any reason why do it on SQL level. It can sence only in psql as same special filter - if we would to enhace a report features there. The SQL level is exactly the correct and proper place to do this. Its all about mathematical parity. That is the primary reason to do it. - "SELECT *" gives you a whole set. - "SELECT foo, bar" gives you a subset of that. - "SELECT ALL BUT foo, bar" gives you the complementary subset. There's a variety of uses for specifying complementary subsets, and when the clean syntax is available, people will start using it for cleaner code, even if they previously had workarounds. The complementary subset should be implemented in exactly the same place and manner as the forward subset, on the SQL level. Arguing against this is like arguing against a subtraction operator, because we can emulate using addition plus negation, or saying subtraction should just be a special filter in a client app. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers