Re: [HACKERS] Possible patch for better index name choosing
I wrote: Attached is a WIP patch for addressing the problems mentioned in this thread: http://archives.postgresql.org/pgsql-hackers/2009-12/msg01764.php ... There is one thing that is not terribly nice about the behavior, which is that CREATE TABLE LIKE INCLUDING INDEXES is unable to generate smart names for expression indexes; ... The reason for this is that the patch depends on FigureColname which works on untransformed parse trees, and we don't have access to such a tree when copying an existing index. There seem to be three possible responses to that: ... 3. Implement a separate FigureIndexColname function that works as much like FigureColname as it can, but takes a transformed parse tree. I fooled around with this solution and decided that it is a lot messier than it's worth. In the first place, we can't make a FigureColname-like function that just takes a transformed tree: there is no way to interpret Vars without some context. You need at least a table OID, and more than that if you'd like to handle cases like multiple-relation expressions or non-RELATION RTEs. For the case at hand of index expressions, a table OID would be enough, but that doesn't leave much room for imagining the function could be used for anything else in future. Worse, for the problematic case (CREATE TABLE LIKE) we actually do not have a table OID because the target table doesn't exist yet. We could finesse that by passing the source table's OID instead, but that seems pretty klugy itself. In the second place, the number of corner cases where we'd generate output different from FigureColname is much greater than I realized. As an example, if foo is a type name then foo(x) and x::foo produce the same parsed tree, but FigureColname will treat them differently. Seeing that CREATE TABLE LIKE doesn't try to reproduce the source table's index names anyway, I'm inclined to just go with the patch as-is and not try to make it handle this one case nicely. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible patch for better index name choosing
On mån, 2009-12-21 at 00:03 -0500, Tom Lane wrote: Well, we could tamp down the risks considerably if we undid my point (1), namely to still consider only the first index column when generating a name. I think putting all the column names into the index names instead of only the first is a significant improvement that should be kept. If we can't do it properly in some cases, we should punt in some obvious way, not pretend to do the correct thing but actually omit some bits. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible patch for better index name choosing
Peter Eisentraut pete...@gmx.net writes: On mån, 2009-12-21 at 00:03 -0500, Tom Lane wrote: Well, we could tamp down the risks considerably if we undid my point (1), namely to still consider only the first index column when generating a name. I think putting all the column names into the index names instead of only the first is a significant improvement that should be kept. Yeah, I think so too. It's well worth any risk of application incompatibility --- we make much bigger changes in every major release without blinking. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Possible patch for better index name choosing
Attached is a WIP patch for addressing the problems mentioned in this thread: http://archives.postgresql.org/pgsql-hackers/2009-12/msg01764.php The main things that it does are (1) consider all index columns, not just the first one as formerly; and (2) try to generate a usable name for index expression columns, rather than just ignoring them which was the effective behavior formerly. There are several changes in the regression test outputs, mostly as a result of choice (1). I've not bothered to update the expected files yet but just attached the output diffs to show what happens. There is one thing that is not terribly nice about the behavior, which is that CREATE TABLE LIKE INCLUDING INDEXES is unable to generate smart names for expression indexes; it falls back to expr, as for example in regression=# create table foo (f1 text, exclude (lower(f1) with =)); NOTICE: CREATE TABLE / EXCLUDE will create implicit index foo_lower_exclusion for table foo CREATE TABLE regression=# create table foo2 (like foo including all); NOTICE: CREATE TABLE / EXCLUDE will create implicit index foo2_expr_exclusion for table foo2 CREATE TABLE The reason for this is that the patch depends on FigureColname which works on untransformed parse trees, and we don't have access to such a tree when copying an existing index. There seem to be three possible responses to that: 1. Decide this isn't worth worrying about and use the patch as-is. 2. Change FigureColname to work on already-transformed expressions. I don't care for this idea much, for two reasons. First, FigureColname would become significantly slower (eg, it would have to do catalog lookups to resolve names of Vars, instead of just pulling the name out of a ColumnRef), and this is objectionable considering it's part of the required parsing path for even very simple commands. Second, there are various corner cases where we'd get different results, which would likely break applications that are expecting specific result column names from given queries. 3. Implement a separate FigureIndexColname function that works as much like FigureColname as it can, but takes a transformed parse tree. This fixes the LIKE case and also removes the need for the iexprname field that the attached patch adds to IndexElem. I think it largely overcomes the two objections to idea #2, since an extra few lookups during index creation are hardly a performance problem, and exact application compatibility shouldn't be an issue here either. It's a bit ugly to have to keep two such functions in sync though. I'm not real sure whether to go with the patch as-is or use idea #3. It seems to depend on how annoyed you are by the LIKE behavior. A different consideration is whether it's really a good idea to be messing with default index names at all. As illustrated in the attached regression diffs, this does impact the error messages returned to applications for unique-index failures. I don't think this is a serious problem across a major version update, but maybe someone thinks differently. Comments? regards, tom lane Index: src/backend/bootstrap/bootparse.y === RCS file: /cvsroot/pgsql/src/backend/bootstrap/bootparse.y,v retrieving revision 1.101 diff -c -r1.101 bootparse.y *** src/backend/bootstrap/bootparse.y 7 Dec 2009 05:22:21 - 1.101 --- src/backend/bootstrap/bootparse.y 21 Dec 2009 02:47:04 - *** *** 322,328 { IndexElem *n = makeNode(IndexElem); n-name = $1; ! n-expr = NULL; n-opclass = list_make1(makeString($2)); n-ordering = SORTBY_DEFAULT; n-nulls_ordering = SORTBY_NULLS_DEFAULT; --- 322,329 { IndexElem *n = makeNode(IndexElem); n-name = $1; ! n-iexpr = NULL; ! n-iexprname = NULL; n-opclass = list_make1(makeString($2)); n-ordering = SORTBY_DEFAULT; n-nulls_ordering = SORTBY_NULLS_DEFAULT; Index: src/backend/catalog/index.c === RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v retrieving revision 1.326 diff -c -r1.326 index.c *** src/backend/catalog/index.c 9 Dec 2009 21:57:50 - 1.326 --- src/backend/catalog/index.c 21 Dec 2009 02:47:04 - *** *** 217,224 indexpr_item = lnext(indexpr_item); /* ! * Make the attribute's name pg_expresssion_nnn (maybe think of ! * something better later) */ sprintf(NameStr(to-attname), pg_expression_%d, i + 1); --- 217,226 indexpr_item = lnext(indexpr_item); /* ! * Make the attribute's name pg_expression_nnn (maybe think of ! * something better later? we're kind of locked in now, though, ! * because pg_dump exposes these names when dumping comments on ! * index columns) */ sprintf(NameStr(to-attname), pg_expression_%d, i + 1); Index: src/backend/commands/indexcmds.c
Re: [HACKERS] Possible patch for better index name choosing
On Sun, Dec 20, 2009 at 10:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: Attached is a WIP patch for addressing the problems mentioned in this thread: http://archives.postgresql.org/pgsql-hackers/2009-12/msg01764.php The main things that it does are (1) consider all index columns, not just the first one as formerly; and (2) try to generate a usable name for index expression columns, rather than just ignoring them which was the effective behavior formerly. I'm not really sure there's any point to this. Anyone who cares about giving their index an intelligible name should manually assign one. If they don't bother doing that, I don't really see why we should worry about it either. If anything, it seems like we should err on the side of simplicity, since some users (or even applications) might attempt to identify or predict automatically generated names. A different consideration is whether it's really a good idea to be messing with default index names at all. As illustrated in the attached regression diffs, this does impact the error messages returned to applications for unique-index failures. I don't think this is a serious problem across a major version update, but maybe someone thinks differently. Maybe I'll reserve final judgement pending further discussion, but my first reaction is to say it's not worth the risk. Probably this shouldn't be an issue for a well-designed application, but the world is full of badly-written code. We shouldn't throw up barriers (even relatively trivial ones) to updating applications unless we get something out of it, and I'm not convinced that's the case here. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible patch for better index name choosing
Robert Haas robertmh...@gmail.com writes: On Sun, Dec 20, 2009 at 10:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: Attached is a WIP patch for addressing the problems mentioned in this thread: http://archives.postgresql.org/pgsql-hackers/2009-12/msg01764.php I'm not really sure there's any point to this. Anyone who cares about giving their index an intelligible name should manually assign one. If they don't bother doing that, I don't really see why we should worry about it either. Mainly because we historically *have* put some work into it, and it would be inconsistent to not pay attention to the point now as we extend the set of possible index-building constraints further. In particular we're going to see a lot of exclusion constraints named foo_exclusionN if we don't expend any effort on it now. I also claim that this is necessary infrastructure if we are going to accept Peter's proposal of allowing CREATE INDEX without an explicit index name. That is really dependent on the assumption that the system will expend more than no effort on picking useful names. Maybe I'll reserve final judgement pending further discussion, but my first reaction is to say it's not worth the risk. Probably this shouldn't be an issue for a well-designed application, but the world is full of badly-written code. We shouldn't throw up barriers (even relatively trivial ones) to updating applications unless we get something out of it, and I'm not convinced that's the case here. Well, we could tamp down the risks considerably if we undid my point (1), namely to still consider only the first index column when generating a name. I am not really happy with that answer though. I could turn your first point back on you: if an app is concerned about the exact names assigned to indexes, why isn't it specifying them? It's worth noting that pg_dump does preserve index names, so this isn't going to be an issue in any case for existing apps that dump and reload their databases. AFAICS the only case where it would actually create a compatibility issue is if an existing app creates multi-column UNIQUE (non-PKEY) constraints on-the-fly, without a constraint name, and depends on the generated name being the same as before. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible patch for better index name choosing
On Mon, Dec 21, 2009 at 12:03 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Dec 20, 2009 at 10:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: Attached is a WIP patch for addressing the problems mentioned in this thread: http://archives.postgresql.org/pgsql-hackers/2009-12/msg01764.php I'm not really sure there's any point to this. Anyone who cares about giving their index an intelligible name should manually assign one. If they don't bother doing that, I don't really see why we should worry about it either. Mainly because we historically *have* put some work into it, and it would be inconsistent to not pay attention to the point now as we extend the set of possible index-building constraints further. In particular we're going to see a lot of exclusion constraints named foo_exclusionN if we don't expend any effort on it now. Maybe that's worth fixing and maybe it isn't. My first reaction is so what? In all likelihood, you're going to have to look at the index definition to see what the thing does anyway. I also claim that this is necessary infrastructure if we are going to accept Peter's proposal of allowing CREATE INDEX without an explicit index name. That is really dependent on the assumption that the system will expend more than no effort on picking useful names. That's a point to consider, though perhaps if they aren't specifying a name it means they don't care that much. Maybe I'll reserve final judgement pending further discussion, but my first reaction is to say it's not worth the risk. Probably this shouldn't be an issue for a well-designed application, but the world is full of badly-written code. We shouldn't throw up barriers (even relatively trivial ones) to updating applications unless we get something out of it, and I'm not convinced that's the case here. Well, we could tamp down the risks considerably if we undid my point (1), namely to still consider only the first index column when generating a name. I am not really happy with that answer though. I could turn your first point back on you: if an app is concerned about the exact names assigned to indexes, why isn't it specifying them? It's worth noting that pg_dump does preserve index names, so this isn't going to be an issue in any case for existing apps that dump and reload their databases. AFAICS the only case where it would actually create a compatibility issue is if an existing app creates multi-column UNIQUE (non-PKEY) constraints on-the-fly, without a constraint name, and depends on the generated name being the same as before. Right. Imagine, for example, a poorly written initialization script for an app. Existing instances that are dumped and reloaded will be OK, but new instances might not come out as expected. I don't think that what you're proposing here is completely stupid; I'm just wondering if it's not an ultimately somewhat pointless activity. I'm not convinced that it's possible or sensible to try to stringify all the things that people put in their index definitions, or that we're going to be able to do it well enough to really add any value.Perhaps I should RTFP before sticking my neck out too far, but... will you serialize EXCLUDE (a =), EXCLUDE (a ), and EXCLUDE (a some other operator) differently? And if so, do you expect the user to be able to reconstruct what the constraint is doing by looking at the serialized version? It seems like something reasonably sane can be done when the definition uses mostly column names and functions, but operators seem like more of a problem. I think mostly people are going to see the constraint name that got violated and then run \d on the table and look for it. foo_exclusion3 may not be very informative, but it's easy to remember for long enough to find it in the \d output, whereas something long and hairy may not be. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible patch for better index name choosing
Robert Haas robertmh...@gmail.com writes: Perhaps I should RTFP before sticking my neck out too far, but... will you serialize EXCLUDE (a =), EXCLUDE (a ), and EXCLUDE (a some other operator) differently? No, and I'm not proposing to expose ASC/DESC/NULLS FIRST/LAST or nondefault opclasses (to say nothing of non-btree AMs) or index predicates either. The proposed patch is to my mind just a logical extension of what we have always done --- namely, to pay attention to index column names --- to some new cases that were never exposed before. We could certainly make it pay attention to all that stuff, but I have the same feeling you do that it wouldn't produce readable results. And it would make any compatibility issues a lot worse. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers