Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
=?UTF-8?B?SmFuIEt1bmRyw6F0?= j...@flaska.net writes: Attached is a second version of this patch which keeps the size of the output at 64 characters per column (which is an arbitrary value defined as a const int, which I hope matches your style). Longer values have their last three characters replaced by ..., so there's no way to distinguish them from a legitimate string that ends with just that. There's also no escaping of special-string values, similar to how the BuildIndexValueDescription operates. Applied with some revisions; notably, that I pulled the code out into a separate subroutine so that it could be used for more than one thing. I was wondering in particular whether it wouldn't be appropriate to include the same errdetail in ExecConstraints's other check, the one for null in not-null columns. Arguably a not-null check is just a slightly optimized form of a CHECK constraint, and anyway if you think you need row identification info for a CHECK failure I don't see why you'd not want it for NOT NULL failure. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Tom Lane t...@sss.pgh.pa.us wrote: I was wondering in particular whether it wouldn't be appropriate to include the same errdetail in ExecConstraints's other check, the one for null in not-null columns. Arguably a not-null check is just a slightly optimized form of a CHECK constraint, and anyway if you think you need row identification info for a CHECK failure I don't see why you'd not want it for NOT NULL failure. Comments? Makes sense to me. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
On 11/10/11 00:48, José Arthur Benetasso Villanova wrote: First, I couldn't apply it as in the email, even in REL9_0_STABLE: the offset doesn't look right. Which commit are your repository in? Hi Jose, thanks for looking at the patch. It's based on b07b2bdc570cfbb39564c8a70783dbce1edcb3d6, which was REL9_0_STABLE at the time I made the change. Cheers, Jan smime.p7s Description: S/MIME Cryptographic Signature
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Hi José and Robert, thanks for your time and a review. Comments below. On 11/10/11 03:47, Robert Haas wrote: It does this already, without this patch. This patch is about CHECK constraints, not UNIQUE ones. That's right. This is how to check what the patch changes: jkt= CREATE TABLE tbl (name TEXT PRIMARY KEY, a INTEGER CHECK (a0)); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index tbl_pkey for table tbl CREATE TABLE jkt= INSERT INTO tbl (name, a) VALUES ('x', 10); INSERT 0 1 jkt= UPDATE tbl SET a = -a; ERROR: new row for relation tbl violates check constraint tbl_a_check DETAIL: New row with data (x, -10) violates check constraint tbl_a_check. The last line, the detailed error message, is added by the patch. I believe we've previously rejected patches along these lines on the grounds that the output could realistically be extremely long. Imagine that you have a table with an integer primary key column and a text column. The integer column has a check constraint on it. But the text column might contain a kilobyte, or a megabyte, or even a gigabyte worth of text, and we don't necessarily want to spit that all out on an error. For unique constraints, we only emit the values that are part of the constraint, which in most cases will be relatively short (if they're more than 8kB, they won't fit into an index block); but here the patch wants to dump the whole tuple, and that could be really big. That's an interesting thought. I suppose the same thing is an issue with unique keys, but they tend to not be created over huge columns, so it isn't really a problem, right? Would you object to a patch which outputs just the first 8kB of each column? Having at least some form of context is very useful in my case. (And as a side note, I'm not really familiar with Postgres' internals, so it took me roughly six hours to arrive to a working patch from the very start. I'd therefore welcome pointers about the best way to achieve that with Postgres' string stream interface.) With kind regards, Jan -- Trojita, a fast e-mail client -- http://trojita.flaska.net/ signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Hello (And as a side note, I'm not really familiar with Postgres' internals, so it took me roughly six hours to arrive to a working patch from the very start. I'd therefore welcome pointers about the best way to achieve that with Postgres' string stream interface.) http://www.pgsql.cz/index.php/C_a_PostgreSQL_-_intern%C3%AD_mechanismy Regards Pavel With kind regards, Jan -- Trojita, a fast e-mail client -- http://trojita.flaska.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Hi José and Robert, thanks for your time and a review. Comments below. On 11/10/11 03:47, Robert Haas wrote: It does this already, without this patch. This patch is about CHECK constraints, not UNIQUE ones. That's right. This is how to check what the patch changes: jkt= CREATE TABLE tbl (name TEXT PRIMARY KEY, a INTEGER CHECK (a0)); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index tbl_pkey for table tbl CREATE TABLE jkt= INSERT INTO tbl (name, a) VALUES ('x', 10); INSERT 0 1 jkt= UPDATE tbl SET a = -a; ERROR: new row for relation tbl violates check constraint tbl_a_check DETAIL: New row with data (x, -10) violates check constraint tbl_a_check. The last line, the detailed error message, is added by the patch. I believe we've previously rejected patches along these lines on the grounds that the output could realistically be extremely long. Imagine that you have a table with an integer primary key column and a text column. The integer column has a check constraint on it. But the text column might contain a kilobyte, or a megabyte, or even a gigabyte worth of text, and we don't necessarily want to spit that all out on an error. For unique constraints, we only emit the values that are part of the constraint, which in most cases will be relatively short (if they're more than 8kB, they won't fit into an index block); but here the patch wants to dump the whole tuple, and that could be really big. That's an interesting thought. I suppose the same thing is an issue with unique keys, but they tend to not be created over huge columns, so it isn't really a problem, right? Would you object to a patch which outputs just the first 8kB of each column? Having at least some form of context is very useful in my case. (And as a side note, I'm not really familiar with Postgres' internals, so it took me roughly six hours to arrive to a working patch from the very start. I'd therefore welcome pointers about the best way to achieve that with Postgres' string stream interface.) With kind regards, Jan -- Trojita, a fast e-mail client -- http://trojita.flaska.net/ signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
2011/11/10 Jan Kundrát j...@flaska.net: On 11/10/11 03:47, Robert Haas wrote: It does this already, without this patch. This patch is about CHECK constraints, not UNIQUE ones. That's right. This is how to check what the patch changes: jkt= CREATE TABLE tbl (name TEXT PRIMARY KEY, a INTEGER CHECK (a0)); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index tbl_pkey for table tbl CREATE TABLE jkt= INSERT INTO tbl (name, a) VALUES ('x', 10); INSERT 0 1 jkt= UPDATE tbl SET a = -a; ERROR: new row for relation tbl violates check constraint tbl_a_check DETAIL: New row with data (x, -10) violates check constraint tbl_a_check. The last line, the detailed error message, is added by the patch. The patch uses 'New row with data ' but it was an UPDATE, if you go further with this patch, IMO the message should be fixed too. -- Dickson S. Guedes mail/xmpp: gue...@guedesoft.net - skype: guediz http://guedesoft.net - http://www.postgresql.org.br -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
On 11/10/11 12:41, Dickson S. Guedes wrote: jkt= UPDATE tbl SET a = -a; ERROR: new row for relation tbl violates check constraint tbl_a_check DETAIL: New row with data (x, -10) violates check constraint tbl_a_check. The last line, the detailed error message, is added by the patch. The patch uses 'New row with data ' but it was an UPDATE, if you go further with this patch, IMO the message should be fixed too. I'm not sure whether the code can determine whether the check gets triggered by an UPDATE or an INSERT (both commands lead to this code path). Please note that the already-existing error message (the ERROR: line in the output I enclosed) already uses the phrase new row. That said, I'll of course be more than happy to include whatever string which fits better, and am open to any suggestions. Cheers, Jan -- Trojita, a fast e-mail client -- http://trojita.flaska.net/ signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
On Thu, Nov 10, 2011 at 5:40 AM, Jan Kundrát j...@flaska.net wrote: That's an interesting thought. I suppose the same thing is an issue with unique keys, but they tend to not be created over huge columns, so it isn't really a problem, right? Pretty much. Would you object to a patch which outputs just the first 8kB of each column? Having at least some form of context is very useful in my case. Well, if we're going to try to emit some context here, I'd suggest that we try to output only the columns implicated in the CHECK constraint, rather than the whole tuple. I'm not sure whether emitting only a certain amount of output (either total, or for each column) can be made to work nicely, or whether the feature overall is something we want. It seems like a trade-off between possibly useful context and possibly annoying log clutter, and I guess I don't have a strong opinion on which way to go with it. Anyone else have an opinion? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
On 11/10/11 13:04, Robert Haas wrote: Well, if we're going to try to emit some context here, I'd suggest that we try to output only the columns implicated in the CHECK constraint, rather than the whole tuple. I'm not sure whether emitting only a certain amount of output (either total, or for each column) can be made to work nicely, or whether the feature overall is something we want. It seems like a trade-off between possibly useful context and possibly annoying log clutter, and I guess I don't have a strong opinion on which way to go with it. OK, let me start with some background on why I actually want to have such a feature. The project which we're working on [1] (and [2] for some context about why the hell we bother) allows users to define layout of their DB tables using standard CREATE TABLE ... stanzas, including various triggers, check constraints etc etc. What our project does is generating plenty of stored procedures which essentially built a version-control infrastructure around the user-specified table layout. Our workflow utilizes something similar to the concept of a working copy in Subversion. It means that any modifications that users perform are executed on an extra table (the history one) which does not enforce any user-specified constraints. It's only at the time of a commit, where data is moved by `UPDATE tabl SELECT ... FROM tbl_history where revision = $pending_changeset` to its final destination and all the checks, triggers and constraints are enforced. The issue which we've hit is that when the user has specified a CHECK constraint and tries to save many rows at once, we don't have any information about what went wrong besides the name of the check which failed. It's better than nothing, but given that Pg provides very similar information for UNIQUE columns, it looked like a good feature to implement. What I want to find in the end is something which tells me this row causes the error. Unfortunately, as the new row of the table with the constraint is not yet on disk, it doesn't really have its own ctid, and therefore I cannot report that. (Which makes sense, obviously.) I also realize that our use case is a bit esoteric and very far from the mainstream Postgres applications, but I believe that simply having detailed error messages is a good thing overall. Of course it's clearly possible that we're doing it completely wrong, so if someone has a suggestion or would like to chat about that, I'm all ears (feel free to go off-list here). Now I realize that there might be some concerns about error log cluttering etc. On the other hand, I'd take it for granted that it's a good idea to include at least *some* context in the error messages (and I assume that's what the detail field is for). If it's acceptable for UNIQUE constraints to show the index values (which are enough to identify the troublesome row), it seems to me that extending this to CHECKs is a natural further development and leads to better consistency. As I've said earlier, I'm not at all familiar with Postgres' internals, so before I go ahead and spend another night finding out how to look at the table/check metadata and print just the columns which are referenced by a CHECK, if that's even possible, I'd like to know whether such a patch would be welcome and accepted or not :). Again, a big thank you for your review -- it's much appreciated. Cheers, Jan [1] https://projects.flaska.net/projects/deska [2] https://projects.flaska.net/attachments/download/74/2011-11-10-deska-18e4c5b.pdf -- Trojita, a fast e-mail client -- http://trojita.flaska.net/ signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
What I want to find in the end is something which tells me this row causes the error. Unfortunately, as the new row of the table with the constraint is not yet on disk, it doesn't really have its own ctid, and therefore I cannot report that. (Which makes sense, obviously.) Would an error with the row's PK value be useful? Something like row with primary key 'pk_val' fails check 'foo_check'. That would be limited in size, yet give some context. There are two problems I can see: - The PK value doesn't necessarily identify the row in any useful manner (SERIAL primary key in INSERT). - The table might lack PK constraint (skip the detail in this case?) - Anssi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Robert Haas robertmh...@gmail.com writes: On Thu, Nov 10, 2011 at 5:40 AM, Jan Kundrát j...@flaska.net wrote: Would you object to a patch which outputs just the first 8kB of each column? Having at least some form of context is very useful in my case. Well, if we're going to try to emit some context here, I'd suggest that we try to output only the columns implicated in the CHECK constraint, rather than the whole tuple. I think that's likely to be impractical, or at least much more trouble than the feature is worth. Also, if you might emit only a subset of columns, then you have to label them, a la the FK error messages: Key (x,y,z) = (this,that,theother) That's going to make the line length problem worse not better. I concur with just length-limiting the dumped values, and in fact would prefer a limit much more draconian than 8K. Don't we limit the key lengths to 1K or so in FK and unique-key messages? If the goal is to identify the problematic line, I would think that a few dozen bytes per column would be plenty. I'm not sure whether emitting only a certain amount of output (either total, or for each column) can be made to work nicely, or whether the feature overall is something we want. It seems like a trade-off between possibly useful context and possibly annoying log clutter, and I guess I don't have a strong opinion on which way to go with it. I agree with Jan that this is probably useful; I'm pretty sure there have been requests for it before. We just have to make sure that the length of the message stays in bounds. One tip for keeping the length down: there is no value in repeating information from the primary error message, such as the name of the constraint. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
On Thu, Nov 10, 2011 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Nov 10, 2011 at 5:40 AM, Jan Kundrát j...@flaska.net wrote: Would you object to a patch which outputs just the first 8kB of each column? Having at least some form of context is very useful in my case. Well, if we're going to try to emit some context here, I'd suggest that we try to output only the columns implicated in the CHECK constraint, rather than the whole tuple. I think that's likely to be impractical, or at least much more trouble than the feature is worth. Also, if you might emit only a subset of columns, then you have to label them, a la the FK error messages: Key (x,y,z) = (this,that,theother) That's going to make the line length problem worse not better. Depends. A lot of CHECK constraints may only reference one column: CHECK (a 0). The whole record is likely to be a lot longer than (a)=(-32768), and frankly tuples without column names aren't that readable anyway. I'd argue that to some degree, CHECK constraints, like UNIQUE constraints, probably tend to be placed primarily on relatively short columns. Now, UNIQUE constraints have a hard limitation, because a too-large value won't fit into an index block. And certainly you could do CHECK (document_is_valid_json(mumbleblump)). But many things that contain large amounts of text will just be free text fields, they won't be part of any constraint, and including them will just make things unreadable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Robert Haas robertmh...@gmail.com writes: On Thu, Nov 10, 2011 at 10:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Well, if we're going to try to emit some context here, I'd suggest that we try to output only the columns implicated in the CHECK constraint, rather than the whole tuple. I think that's likely to be impractical, or at least much more trouble than the feature is worth. Also, if you might emit only a subset of columns, then you have to label them, a la the FK error messages: Key (x,y,z) = (this,that,theother) That's going to make the line length problem worse not better. Depends. A lot of CHECK constraints may only reference one column: CHECK (a 0). The whole record is likely to be a lot longer than (a)=(-32768), and frankly tuples without column names aren't that readable anyway. Well, the other concern here is: how much context does it take to identify the problematic row? It's entirely likely that showing only the value of a isn't enough to solve the user's problem anyhow. So I think the argument for showing a subset of columns is quite weak. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
On 11/10/11 16:05, Tom Lane wrote: I agree with Jan that this is probably useful; I'm pretty sure there have been requests for it before. We just have to make sure that the length of the message stays in bounds. One tip for keeping the length down: there is no value in repeating information from the primary error message, such as the name of the constraint. Thanks to your comments and suggestions, I appreciate the time of the reviewers. Attached is a second version of this patch which keeps the size of the output at 64 characters per column (which is an arbitrary value defined as a const int, which I hope matches your style). Longer values have their last three characters replaced by ..., so there's no way to distinguish them from a legitimate string that ends with just that. There's also no escaping of special-string values, similar to how the BuildIndexValueDescription operates. Cheers, Jan -- Trojita, a fast e-mail client -- http://trojita.flaska.net/ diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 504f4de..9c2b285 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -1364,10 +1364,42 @@ ExecConstraints(ResultRelInfo *resultRelInfo, const char *failed; if ((failed = ExecRelCheck(resultRelInfo, slot, estate)) != NULL) + { + StringInfoData buf; + int natts = rel-rd_att-natts; + int i; + initStringInfo(buf); + for (i = 0; i natts; ++i) + { + char *val; + Oid foutoid; + bool typisvarlena; + size_t fieldlen; + const int cutofflen = 64; + getTypeOutputInfo(rel-rd_att-attrs[i]-atttypid, foutoid, typisvarlena); + if (slot-tts_isnull[i]) + val = NULL; + else + val = OidOutputFunctionCall(foutoid, slot-tts_values[i]); + if (i 0) + appendStringInfoString(buf, , ); + fieldlen = strlen(val); + if (fieldlen cutofflen) + { + appendBinaryStringInfo(buf, val, cutofflen - 3); + appendStringInfoString(buf, ...); + } + else + { + appendStringInfoString(buf, val); + } + } ereport(ERROR, (errcode(ERRCODE_CHECK_VIOLATION), errmsg(new row for relation \%s\ violates check constraint \%s\, - RelationGetRelationName(rel), failed))); + RelationGetRelationName(rel), failed), +errdetail(Failing row: (%s)., buf.data))); + } } } signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
2011/11/9 José Arthur Benetasso Villanova jose.art...@gmail.com: postgres=# create table test1(id serial primary key, value text); NOTICE: CREATE TABLE will create implicit sequence test1_id_seq for serial column test1.id NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index test1_pkey for table test1 CREATE TABLE postgres=# ALTER TABLE test1 ADD CONSTRAINT must_be_unique unique (value); NOTICE: ALTER TABLE / ADD UNIQUE will create implicit index must_be_unique for table test1 ALTER TABLE postgres=# insert into test1 values (default, 'Hello World'); INSERT 0 1 postgres=# insert into test1 values (default, 'Hello World'); ERROR: duplicate key value violates unique constraint must_be_unique DETAIL: Key (value)=(Hello World) already exists. It does this already, without this patch. This patch is about CHECK constraints, not UNIQUE ones. I believe we've previously rejected patches along these lines on the grounds that the output could realistically be extremely long. Imagine that you have a table with an integer primary key column and a text column. The integer column has a check constraint on it. But the text column might contain a kilobyte, or a megabyte, or even a gigabyte worth of text, and we don't necessarily want to spit that all out on an error. For unique constraints, we only emit the values that are part of the constraint, which in most cases will be relatively short (if they're more than 8kB, they won't fit into an index block); but here the patch wants to dump the whole tuple, and that could be really big. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers