[HACKERS] Documentation broken due to typo
The following commit to release-8.5.sgml: http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=e043393bcc7ed484c3e3cd6a1ce4c63651ab46be broke the documentation (no end tag). Wouldn't the buildfarm notice this, or does it not build the docs? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Thu, 2009-11-26 at 00:35 -0500, Andrew Dunstan wrote: > On Wed, November 25, 2009 3:56 pm, Jeff Davis wrote: > > > > I worry that we're getting further away from the original problem. Let's > > allow functions to get the bytes of data from a COPY, like the original > > proposal. I am not sure COPY is the best mechanism to move records > > around when INSERT ... SELECT already does that. > > > > > I am not at all sure I think that's a good idea, though. We have > pg_read_file() for getting raw bytes from files. Building that into COPY > does not strike me as a good fit. I think we're in agreement. All I mean is that the second argument to COPY should produce/consume bytes and not records. I'm not discussing the internal implementation at all, only semantics. In other words, STDIN is not a source of records, it's a source of bytes; and likewise for STDOUT. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Thu, 2009-11-26 at 05:01 +0100, Pavel Stehule wrote: > It working like: > > 1. EXECUTE SELECT 0 FROM generate_series(1,...); > 2. STORE RESULT TO TABLE zero; > 3. EXECUTE SELECT 1/i FROM zero; > 4. STORE RESULT TO TABLE tmp; > > Problem is in seq execution. Result is stored to destination after > execution - so any materialisation is necessary, > My example showed that steps 3 and 4 are not executed sequentially, but are executed together. If 3 was executed entirely before 4, then the statement: insert into tmp select 1/i from zero; would have to read the whole table "zero" before an error is encountered. However, the statement errors immediately, showing that steps 3 and 4 are pipelined. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Wed, 2009-11-25 at 15:59 -0800, Jeff Davis wrote: > > My operator-class-fu is insufficient to render judgment on this point. > > I think the thing to do is look at a bunch of non-built-in opclasses > > and check for POLA violations. > > Ok, I'll consider this more. In cases where the operator class type is different from the search operator's operands' types, one of the following is usually true: * There is a binary cast from the opclass type (same as expression type) to the search operator's operands' types, or it is otherwise compatible (e.g. ANYARRAY). * There is a candidate function that's a better match (e.g. there may be an =(int8, int8) on int2_ops, but there's also =(int2, int2)). * The left and right operand types are different, and therefore do not work with operator exclusion constraints anyway (e.g. full text search @@). After installing all contrib modules, plus my period module, and postgis, there appear to be no instances that violate these assumptions (according to a join query and some manual testing). In theory there could be, however. It's kind of ugly to make it work, and a challenge to test it, so for now I'll only accept operators returned by compatible_oper(). If you disagree, I can make it work. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Wed, 2009-11-25 at 09:02 -0500, Robert Haas wrote: > I disagree wholeheartedly. :-) My ideal error message is something like: > > DETAIL: (a, b, c)=(1, 2, 3) conflicts with (a, b, c)=(4, 5, 6) > > In particular, I think it's very important that we only emit the > columns which are part of the operator exclusion constraints, and NOT > all the columns of the tuple. The entire tuple could be very large - > one of the columns not involved in the constraint could be a 4K block > of text, for example, and spitting that out only obscures the real > source of the problem. You could argue that one of the columns > involved in the constraint could be a 4K block text, too, but in > practice I think the constraint columns are likely to be short nine > times out of ten. The equals sign is equating the column names to the > column values, not the values to each other, so I don't see that as > confusing. Ok, fair enough. But how do you feel about: (a: 1, b: 2, c: 3) as a tuple representation instead? I think it's closer to the way people expect a tuple to be represented. I can change it in one place so that the unique violations are reported that way, as well (as long as there are no objections). It still doesn't contain the operators, but they can look at the constraint definition for that. > My operator-class-fu is insufficient to render judgment on this point. > I think the thing to do is look at a bunch of non-built-in opclasses > and check for POLA violations. Ok, I'll consider this more. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Wed, 2009-11-25 at 11:32 +0100, Pavel Stehule wrote: > 1. > postgres=# select count(*) from generate_series(1,100); > count > ─ > 100 > (1 row) > > Time: 930,720 ms > > 2. > postgres=# select count(*) from (select generate_series(1,100)) x; > count > ─ > 100 > (1 row) > > Time: 276,511 ms > > 2. is significantly faster then 1 (there are not SRF materialisation) I think case #1 can be fixed. > generate_function is fast and simple - but still COPY is about 30% faster My quick tests are not consistent enough, so I will have to try with more data. The times look similar to me so far. If there is a difference, I wonder what it is? > I thing, so materialisation is every time, when you use any SQL > statement without cursor. I don't think that is true. Here's an expanded version of my previous example: create table zero(i int); create table tmp(j int); insert into zero select 0 from generate_series(1,100); -- all 0 insert into tmp select 1/i from zero; -- error immediately, doesn't wait The error would take longer if it materialized the table "zero". But instead, it passes the first tuple to the function for "/" before the other tuples are read, and gets an error immediately. So no materialization. I worry that we're getting further away from the original problem. Let's allow functions to get the bytes of data from a COPY, like the original proposal. I am not sure COPY is the best mechanism to move records around when INSERT ... SELECT already does that. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Wed, 2009-11-25 at 09:23 +0100, Pavel Stehule wrote: > > If SRFs use a tuplestore in that situation, it sounds like that should > > be fixed. Why do we need to provide alternate syntax involving COPY? > > It isn't problem of SRF function design. It allow both mode - row and > tuplestor. select * from generate_series(1,10) limit 1; That statement takes a long time, which indicates to me that it's materializing the result of the SRF. And there's no insert there. > This is problem of INSERT statement, resp. INSERT INTO > SELECT implementation. If "tmp" is a new table, and "zero" is a table with a million zeros in it, then: insert into tmp select 1/i from zero; fails instantly. That tells me that it's not materializing the result of the select; rather, it's feeding the rows in one at a time. Can show me in more detail what you mean? I'm having difficulty understanding your short replies. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Wed, 2009-11-25 at 07:36 +0100, Pavel Stehule wrote: > > Moving records from a function to a table can be done with: > > INSERT INTO mytable SELECT * FROM myfunc(); > > And that already works fine. > > It works, but COPY FROM myfunc() should be significantly faster. You > can skip tuple store. If SRFs use a tuplestore in that situation, it sounds like that should be fixed. Why do we need to provide alternate syntax involving COPY? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Wed, 2009-11-25 at 07:31 +0100, Pavel Stehule wrote: > > My disagreement with the row-by-row approach is more semantics than > > performance. COPY translates records to bytes and vice-versa, and your > > original patch maintains those semantics. > > uff, really > > COPY CSV ? CSV is like text or binary: just another format to _represent_ records as a sequence of bytes. A CSV file is not a set of postgresql records until COPY interprets it as such. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Wed, 2009-11-25 at 06:35 +0100, Pavel Stehule wrote: > I believe so using an "internal" minimalize necessary changes in COPY > implementation. Using a funcapi needs more work inside COPY - you > have to take some functionality from COPY to stream functions. > Probably the most slow operations is parsing - calling a input > functions. This is called once every where. Second slow operation is > reading from network - it is same. So I don't see too much reasons, > why non internal implementation have to be significant slower than > your actual implementation. I am sure, so it needs more work. I apologize, but I don't understand what you're saying. Can you please restate with some examples? It seems like you're advocating that we move records from a table into a function using COPY. But that's not what COPY normally does: COPY normally translates records to bytes or bytes to records. Moving records from a table to a function can be done with: SELECT myfunc(mytable) FROM mytable; already. The only problem is if you want initialization/destruction. But I'm not convinced that COPY is the best tool to provide that. Moving records from a function to a table can be done with: INSERT INTO mytable SELECT * FROM myfunc(); And that already works fine. So what use case are you concerned about? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Tue, 2009-11-24 at 21:42 -0800, Daniel Farina wrote: > You are probably right. We could try coercing to bytea and back out > to bytes, although it seems like a superfluous cost to force > *everyone* to pay just to get the same bytes to a network buffer. Well, I suppose only performance will tell. Copying a buffer is sure to be faster than invoking all of the type input/output functions, or even send/recv, so perhaps it's not a huge penalty. My disagreement with the row-by-row approach is more semantics than performance. COPY translates records to bytes and vice-versa, and your original patch maintains those semantics. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Tue, 2009-11-24 at 23:44 -0500, Tom Lane wrote: > If you do that, then there is no possibility of ever using this feature > except with C-coded functions, which seems to me to remove most of > whatever use-case there was. It fits the use case involving dblink (or dblink-like modules). Maybe the patch's performance should be tested with and without copying the buffer, to see if we're losing anything significant. If we can do almost as well copying the data and passing that as a bytea value to the function, then I agree that would be better. I still don't see any reason to force it to be record by record though. If the point is to push data from a table into a remote table, why should the copied data be translated out of binary format into a record, and then back into binary form to send to the remote system? Currently, the second argument to copy is a source or destination of bytes, not records. So forcing it to deal with records is inconsistent. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Tue, 2009-11-24 at 14:39 +0100, Pavel Stehule wrote: > a) good designed C API like: > >initialise_functions(fcinfo) -- std fcinfo >consument_process_tuple(fcinfo) -- gets standard row -- Datum > dvalues[] + Row description >producent_process_tuple(fcinfo) -- returns standard row -- Datum > dvalues[] + Row description (look on SRF API) >terminate_funnction(fcinfo) > Don't you still need the functions to accept an argument of type internal? Otherwise, we lose the ability to copy a buffer to the dblink connection, which was the original motivation. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
On Tue, 2009-11-24 at 00:54 -0800, Daniel Farina wrote: > On Tue, Nov 24, 2009 at 12:29 AM, Hannu Krosing wrote: > > COPY stdin TO udf(); > > If stdin becomes (is?) a legitimate source of records, then this patch > will Just Work. > STDIN is a source of bytes representing a set of records. Currently, the first argument to COPY is a source or destination of records; and the second argument is a source or destination of bytes representing a set of records. I think we want the first argument to remain a source or destination of real records with types; that is, a table or perhaps a function. And we want the second argument to remain a source or destination of bytes; that is, a file or perhaps a function (albeit not the same kind as the former function). > > COPY udf() FROM stdin; > > This is unaddressed, but I think it would be a good idea to consider > enabling this kind of thing prior to application. This makes much more sense, but it is a very different type of function from the original proposal (which basically accepts a buffer). I agree that it sounds useful and would be good for the sake of symmetry. One use case may be a degree of data cleaning. For instance, you could use a "looser" function definition, like udf(cstring, cstring, ...), where all COPY does is break up the records into fields, and the function can recover from type input failures using subtransactions. Binary mode could do a similar thing with bytea. However, I recommend that you don't try to generalize this as a data cleanup feature that can handle ragged input. That seems like a separate problem that will distract from the original use case. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
I'm in Tokyo right now, so please excuse my abbreviated reply. On Tue, 2009-11-17 at 23:13 -0500, Robert Haas wrote: > Forgive me if this is discussed before, but why does this store the > strategy numbers of the relevant operators instead of the operators > themselves? At constraint definition time, I need to make sure that the strategy numbers can be identified anyway, so it wouldn't save any work in ATAddOperatorExclusionConstraint. At the time it seemed slightly more direct to use strategy numbers in index_check_constraint, but it's probably about the same. > It seems like this could lead to surprising behavior if > the user modifies the definition of the operator class. Right now, operator classes can't be modified in any meaningful way. Am I missing something? > I'm wondering if we can't use the existing > BuildIndexValueDescription() rather than the new function > tuple_as_string(). I realize there are two tuples, but maybe it makes > sense to just call it twice? Are you suggesting I change the error output, or reorganize the code to try to reuse BuildIndexValueDescription, or both? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New VACUUM FULL
On Mon, 2009-11-16 at 13:37 +0900, Itagaki Takahiro wrote: > Jeff Davis wrote: > > > You left INPLACE in the patch > > Oops, removed. > > > Sounds fine, but worth a mention in the documentation. Just add to the > > "columns" part of the VACUUM page something like: "If specified, implies > > ANALYZE". > > Added. > Great, I am marking this part "ready for committer". I may be slow to review the remainder of the VACUUM FULL rewrite patch due to the conference in Tokyo, but I will review it soon afterward. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New VACUUM FULL
On Sat, 2009-11-14 at 19:25 -0500, Tom Lane wrote: > As a rule of thumb, I'd recommend keeping as much complexity as possible > *out* of gram.y. It's best if that code just reports the facts, ie what > options the user entered. Deriving conclusions from that (like what > default behaviors should be used) is best done later. One example of > why is that if you want the defaults to depend on GUC settings then > that logic must *not* happen in gram.y, since the parse tree might > outlive the current GUC settings. I was referring to (in vacuum()): + if (vacstmt->options & (VACOPT_INPLACE | VACOPT_REPLACE | VACOPT_FREEZE)) + vacstmt->options |= VACOPT_VACUUM; + if (vacstmt->va_cols) + vacstmt->options |= VACOPT_ANALYZE; I think what you say applies to VACOPT_ANALYZE, which is implied when columns are supplied by the user but ANALYZE is not specified explicitly. In that case it should be set in vacuum() but not in gram.y (unless specified by the user). However, for VACOPT_VACUUM, I think that's still in the territory of gram.y. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New VACUUM FULL
On Tue, 2009-10-27 at 13:55 +0900, Itagaki Takahiro wrote: > Here is a patch to support "rewrite" version of VACUUM FULL. Can you please provide a patch that applies cleanly on top of the vacuum options patch and on top of CVS HEAD (there are a couple minor conflicts with this version). That would make review easier. Initial comments: 1. Do we want to introduce syntax for INPLACE at all, if we are eventually going to remove the current mechanism? If not, then we should probably use REWRITE, because that's a better word than "REPLACE", I think. My opinion is that if we really still need the current in-place mechanism, then VACUUM (FULL) should use the current in-place mechanism; and VACUUM (FULL REWRITE) should use your new rewrite mechanism. That gives us a nice migration path toward always using the rewrite mechanism. 2. Why do all of the following exist: VACOPT_FULL, VACOPT_REPLACE, and VACOPT_INPLACE? Shouldn't VACOPT_FULL be equivalent to one of the other two? This is essentially what Simon was getting at, I think. 3. Some options are being set in vacuum() itself. It looks like the options should already be set in gram.y, so should that be an Assert instead? I think it's cleaner to set all of the options properly early on, rather than waiting until vacuum() to interpret the combinations. I haven't looked at the implementation in detail yet, but at a glance, it seems to be a good approach. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New VACUUM FULL
On Fri, 2009-11-13 at 10:47 +0900, Itagaki Takahiro wrote: > Alvaro Herrera wrote: > > > BTW I think the vacstmt.options change, which seems a reasonable idea to > > me, could be extracted from the patch and applied separately. That'd > > reduce the size of the patch somewhat. > > It's a good idea. I separated the part into the attached patch. > It replaces VacuumStmt's "vacuum", "full", "analyze", and "verbose" > fields into one "options" flag field. I added a separate commitfest item for this patch to track it separately from the rewrite version of VACUUM: https://commitfest.postgresql.org/action/patch_view?id=222 > The only user-visible change is to support "VACUUM (options)" syntax: > VACUUM ( FULL, FREEZE, VERBOSE, ANALYZE ) table (columns) > We don't bother with the order of options in this form :) I looked at this patch. You left INPLACE in the patch, which looks like an oversight when you were separating the two. Please remove that from this part. > There is a debatable issue that we can use "VACUUM (VERBOSE) table (col)" > in the abobe syntax. Columns are specified but no ANALYZE option there. > An ANALYZE option is added automatically in the current implementation, > but we should have thrown an syntax error in such case. Sounds fine, but worth a mention in the documentation. Just add to the "columns" part of the VACUUM page something like: "If specified, implies ANALYZE". Other than these two minor issues, I don't see any problems with the patch. Please post an updated version to the new commitfest entry. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New VACUUM FULL
On Fri, 2009-11-13 at 10:47 +0900, Itagaki Takahiro wrote: > Alvaro Herrera wrote: > > > BTW I think the vacstmt.options change, which seems a reasonable idea to > > me, could be extracted from the patch and applied separately. That'd > > reduce the size of the patch somewhat. > > It's a good idea. I separated the part into the attached patch. > It replaces VacuumStmt's "vacuum", "full", "analyze", and "verbose" > fields into one "options" flag field. > > The only user-visible change is to support "VACUUM (options)" syntax: > VACUUM ( FULL, FREEZE, VERBOSE, ANALYZE ) table (columns) > We don't bother with the order of options in this form :) > > There is a debatable issue that we can use "VACUUM (VERBOSE) table (col)" > in the abobe syntax. Columns are specified but no ANALYZE option there. > An ANALYZE option is added automatically in the current implementation, > but we should have thrown an syntax error in such case. I have just begun review by reading some of the previous threads. I'd like to try to summarize the goals we have for VACUUM to put these patches in perspective: 1. Implement a rewrite version of VACUUM FULL, which is closer to CLUSTER. 2. Use the new options syntax, to make the order of vacuum options irrelevant. 3. Implement several flatfile maps to allow the rewrite version of VACUUM FULL to work on catalogs: http://archives.postgresql.org/message-id/19750.1252094...@sss.pgh.pa.us 4. Implement some kind of concurrent tuple mover that can slowly update tuples and lazily VACUUM in such a way that they migrate to lower heap pages (assuming no long-running transactions). This would be done with normal updates (new xmin) and would not remove the old tuple until it was really dead; otherwise there are concurrency problems. Such a utility would be useful in cases where a very small fraction of tuples need to be moved, or you don't have enough space for a rewrite. 5. Remove VACUUM FULL (in it's current form) completely. Some other ideas also came out of the thread, like: * Provide a way to truncate the dead space from the end of a relation in a blocking manner. Right now, lazy vacuum won't shrink the file unless it can acquire an exclusive lock without waiting, and there's no way to actually make it wait. This can be a separate command, not necessarily a part of VACUUM. * Josh Berkus suggested allowing the user to specify a different tablespace in which to rebuild the tablespace. I'll begin looking at the patches themselves now, which implement #1 and #2. If we can implement enough of these features (say, #3 also) to remove the current form of VACUUM FULL, then we can just call the new feature VACUUM FULL, and save ourselves from syntactical headaches. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Mon, 2009-11-09 at 09:12 -0800, David E. Wheeler wrote: > On Nov 8, 2009, at 7:43 PM, Jeff Davis wrote: > > > Either of those names are fine with me, too. The current name is a > > somewhat shortened version of the name "operator-based exclusion > > constraints", so we can also just use that name. Or, just "exclusion > > constraints". > > (exclusion constraints)++ Ok, I guess this is another issue that requires consensus. Note: this is purely for documentation, release notes, and user-visible error messages. This does not have any impact on the syntax, I think we've got a strong consensus on that already and I would prefer not to break that discussion open again. 1. Operator Exclusion Constraints (current) 2. Generic/Generalized/General Exclusion Constraints 3. Exclusion Constraints (has the potential to cause confusion with constraint_exclusion) Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Sat, 2009-11-14 at 14:36 -0500, Robert Haas wrote: > I guess my point wasn't that the message was likely to be exercised, > but rather that it isn't obvious that it's describing an error > condition at all. If you get this message: > > relation "whatever" has relopxconstraints = -3 > I pretty much copied similar code for relchecks, see pg_constraint.c:570. If you have a suggestion, I'll make the change. It doesn't sound too urgent though, to me. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Sat, 2009-11-14 at 09:11 -0800, David E. Wheeler wrote: > On Nov 14, 2009, at 8:55 AM, Tom Lane wrote: > > I had been manfully restraining myself from re-opening this discussion, > > but yeah I was thinking the same thing. The original objection to using > > just WITH was that it wasn't very clear what you were doing "with" the > > operator; but that was back when we had a different initial keyword for > > the construct. EXCLUDE ... WITH ... seems to match up pretty naturally. > > You're more man than I, Tom, but yeah, with EXCLUDE, WITH works well on its > own, methinks. Changed in new patch here: http://archives.postgresql.org/message-id/1258226849.708.97.ca...@jdavis Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Fri, 2009-11-13 at 23:39 -0500, Robert Haas wrote: > [ reviewing ] Thank you for the comments so far. > In index_create(), the elog() where relopxconstraints < 0 should just > complain about the value being negative, I think, rather than listing > the value. If you just say the value is -3, it doesn't give the user > a clue why that's bad. Hopefully the user never sees that message -- it's almost an Assert. PostgreSQL uses elog(ERROR,...) in many places that should be unreachable, but might happen due to bugs in distant places or corruption. I'm not sure the exact convention there, but I figure that some details are appropriate. I tried to make all user-visible errors into ereport()s. > In ATAddOperatorExclusionConstraint(), the message "method %s does not > support gettuple" seems a bit user-unfriendly. Can we explain the > problem by referring to the functionality of getttuple(), rather than > the name of it? How about "operator exclusion constraints don't support method X"? Then perhaps have a detail-level message to explain that the access method needs to support the gettuple interface. Trying to describe what gettuple does doesn't help a humble user much. All they care about is "can't use gin". > I don't really like this message, for a number of reasons. > > alter table foo add constraint bar exclude (a check with =, b check with =); > ERROR: operator exclusion constraint violation detected: "foo_a_exclusion" > DETAIL: Tuple "(1, 1, 2)" conflicts with existing tuple "(1, 1, 3)". > > The corresponding error for a UNIQUE index is: could not create unique > index "bar", which I like better. Only the relevant columns from the > tuples are dumped, and the tuple is not surrounded by double quotes; > any reason not to parallel that here? By "relevant columns" I assume you mean the entire index tuple. That means we need to have column names represented somewhere, because we don't want the user to have to match up ordinal index columns. Also, with exclusion constraints, both values are always relevant, not just the one being inserted. What if the user just wants to adjust their request slightly to avoid an overlap -- how do they know how far to go? I know this could be accomplished with extra queries, as well, but that doesn't always work for someone looking through the logs after the fact, when the original values may be gone. So, the kind of error message you're suggesting starts to get awkward: (a: 1 = 1, b: 1 = 1) or something? And then with more complex type output functions, and expression indexes, it starts to look very complex. What do you think is the cleanest approach? > Also, the message is all > lower-case. I know the error conventions are documented somewhere, but I completely forgot where. Can you please point me to the right place? I thought most error messages were supposed to be lower case, and detail messages were supposed to read like sentences. > Similarly, for an insert/update situation, it seems that > a message like "key value violates exclusion constraint \"%s\"" would > be better than the existing message. I can certainly simplify it, but I was trying to match the usefulness of UNIQUE constraint error messages. > As a quick performance test, I inserted a million 3-integer tuples > into a 3-column table with a unique constraint or an operator > exclusion constraint over all three columns. The former took ~ 15 s, > the latter ~ 21.5 s. That seems acceptable. Great news. I had similar results, though they depend on the conflict percentage as well (I assume you had zero conflicts). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] next CommitFest
On Fri, 2009-11-13 at 23:10 +0100, Andres Freund wrote: > I have to admit that at least for me personally its way much easer to get > started on a patch one finds interesting than when not I find it much easier to get interested in a patch after I get started reviewing it ;) Seriously, that's happened with at least a couple patches that have been assigned to me. Either the code itself is interesting, or the feature is useful enough that I would want to learn about it anyway. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] next CommitFest
On Thu, 2009-11-12 at 14:43 -0500, Robert Haas wrote: > > Not all contributors are fluent English readers and writers. > > > > I certainly do not discourage those people from reviewing, but I can > > understand how it might be more frustrating and less productive for > > them. An important part of review is to read the relevant mailing list > > threads and try to tie up loose ends and find a consensus. > > Unfortunately, those people's patches also typically require more work > from other community members. Discussions are longer and more > confused, and someone has to rewrite the documentation and comments > before commit. If anything, people whose patches require more help > need to find ways to contribute MORE to the community, not less. I > understand that's not easy, but it's necessary. Keep in mind that many of these people may also be regional contacts, translators, local conference (or user group) organizers, and provide support to users in languages other than English. They may even organize development efforts in languages other than English. I think the best policies encourage people to help in the ways that they are most effective, and/or enjoy the most. I know that's a general statement, and it doesn't translate (excuse the pun) easily into policy. I don't have a good answer for that. But sometimes the policy discussions here seem a little abstract, and I have a hard time seeing how they apply in real situations. It seems like you have a few "freeloaders" in mind, but I have a hard time imagining it's more than a couple people. Maybe they just need a nice reminder to be a more helpful community member if they want timely feedback on their work? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] next CommitFest
On Thu, 2009-11-12 at 11:36 -0500, Robert Haas wrote: > I agree. I would quibble only with the details. I think we should > require patch authors to act as a reviewer for any CommitFest for > which they have submitted patches. We don't need every patch author > to review as many patches as they submit, because some people will > review extras, and some non-patch-authors will review. If they review > one patch each, that's probably more than enough. It's also easier > for bookkeeping purposes. Not all contributors are fluent English readers and writers. I certainly do not discourage those people from reviewing, but I can understand how it might be more frustrating and less productive for them. An important part of review is to read the relevant mailing list threads and try to tie up loose ends and find a consensus. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] next CommitFest
On Thu, 2009-11-12 at 15:52 -0200, Euler Taveira de Oliveira wrote: > Simon Riggs escreveu: > > So, I > > propose that we simply ignore patches from developers until they have > > done sufficient review to be allowed to develop again. > > Is it really impolite for a first-contributor, no? I believe the community has always given extra help to first-time contributors. If anyone sees a new contributor being ignored, they should make an effort to help. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Mon, 2009-11-09 at 18:03 +, Greg Stark wrote: > Out of curiosity, is this feature at all similar to SQL assertions? > What would we be missing to turn this into them? I addressed that here: http://archives.postgresql.org/pgsql-hackers/2009-11/msg00049.php The exclusion constraint mechanism can enforce a subset of the constraints that ASSERT can express; although the same goes for all other constraints, because ASSERT is very general. The exclusion constraint mechanism requires finding the physical tuples that cause a conflict, so that we know when to wait and on which transaction to wait. Otherwise, we have to wait on all transactions; i.e. serialize. The problem with ASSERT is that it expresses a constraint based on a query, which can return arbitrary logical records after an arbitrary amount of manipulation. So there's no way to work backwards. If we try, we'll end up either: (a) supporting only a tiny subset, and throwing bizarre errors that users don't understand when they try to work outside the template; or (b) deciding to serialize when we can't do better, and again, users will be confused about the performance and locking characteristics. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Sun, 2009-11-08 at 22:03 +, Simon Riggs wrote: > Don't think that name is very useful either... sounds like you want to > exclude operators, which is why I got lost in the first place. I'd call > them "generic exclusion constraints" or "user-defined exclusion > constraints". Sorry for this. Either of those names are fine with me, too. The current name is a somewhat shortened version of the name "operator-based exclusion constraints", so we can also just use that name. Or, just "exclusion constraints". I'll leave the current patch as-is for now, and wait for some reviewer comments. This is purely a documentation issue, so there are bound to be a few of these things that I can clarify at once. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Sat, 2009-11-07 at 14:11 -0500, Robert Haas wrote: > Honestly, I'd probably be in favor of breaking the virtual tie in > favor of whichever word is already a keyword The ones that are already keywords are EXCLUSIVE and EXCLUDING, which are also the least desirable, so that rule doesn't work as a tie-breaker. I think that EXCLUSION and EXCLUDE are the options still in the running here. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Fri, 2009-11-06 at 21:23 -0500, Tom Lane wrote: > Or maybe forget about it and go to EXCLUDE or EXCLUDING? I left it as EXCLUSION for now. "EXCLUDING USING ..." and "EXCLUSIVE USING ..." both sound a little awkward to me. Either could be improved by moving the USING clause around, but that just creates more grammar headaches. EXCLUDE probably flows most nicely with the optional USING clause or without. My only complaint was that it's a transitive verb, so it seems to impart more meaning than it actually can. I doubt anyone would actually be more confused in practice, though. If a couple of people agree, I'll change it to EXCLUDE. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Thu, 2009-11-05 at 11:16 -0800, David E. Wheeler wrote: > Well that's clearly a verb. So perhaps "EXCLUDE USING > gist" ("EXCLUDING USING gist" is a little weirder). That's not bad. As I just said in my other email, I think the word EXCLUDE is a little bit too specific, but the other ideas out there aren't perfect, either. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Thu, 2009-11-05 at 10:30 -0800, David E. Wheeler wrote: > But that doesn't read as well to my eye as: > > EXCLUDE (...) BY ... I think EXCLUDE might be a little *too* specific. It sounds like whatever is on the right hand side will be excluded, but that's not really what happens. EXCLUSION is vague about what is doing the excluding and what is being excluded. I think that's good in this case, because the actual meaning can't easily be expressed with a couple keywords, so suggesting the behavior is about as close as we can get (unless someone comes up with a new idea). > EXCLUDING (...) BY ... I think that's better, but still sounds a little wrong to me. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Thu, 2009-11-05 at 09:56 -0500, Tom Lane wrote: > Robert Haas writes: > > Ooh, that's kind of neat. But I think you'd need EXCLUSIVE (a, b) BY > > (=, =), since it could equally well be EXCLUSIVE (a, b) BY (=, &&). > > Yeah, we definitely want some parentheses delimiting the expression. > EXCLUSIVE still feels like the wrong part-of-speech though. How > about EXCLUDING (...) BY ... instead? I think EXCLUDING conflicts with the EXCLUDING in LIKE. Also, it becomes a little more difficult to place the access method clause, because "EXCLUDING USING gist" doesn't sound great. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal - temporal contrib module
On Wed, 2009-11-04 at 12:08 -0500, Chris Browne wrote: > I'm > a lot less certain about the merits of PK/FK constraints - it is a lot > less obvious what forms of constraints will be able to be applied to > particular applications. Can you clarify, a little? A temporal key just means "non-overlapping periods of time", and that has a very clear meaning with respect to scheduling. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Tue, 2009-11-03 at 21:31 +, Dean Rasheed wrote: > Is this really a generalized uniqueness constraint, extended to > support operators other than = ? That has been discussed in the past: http://archives.postgresql.org/message-id/1253119552.24770.203.ca...@jdavis http://archives.postgresql.org/message-id/1253122946.24770.250.ca...@jdavis However, some constraints allowed by this feature are the *opposite* of unique: consider "<>". Personally, I don't like to use the word UNIQUE to describe a constraint that may reject unique values or permit duplicates. We already have some reasonable agreement around EXCLUSION ... CHECK WITH. We should stick with the current syntax unless there's a good consensus around some other specific proposal. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Mon, 2009-11-02 at 18:28 +, Simon Riggs wrote: > > I like the "NOT" here because "CHECK NOT =" seems to convey pretty > > clearly what it is you are checking for. Because NOT is reserved and > > can't appear as a connective, I think that this approach might allow > > a non-reserved leading word, thus possibly the second variant would > > work without reserving CONSTRAIN. I have not tested whether bison > > agrees with me though ;-). In any case I think "CHECK NOT =" reads > > pretty well, and don't feel a strong urge to use some other word there. > Peter, do any of these ideas work for you? It looks like this opens the door to using a word other than CHECK. CONSTRAIN NOT is a little awkward, is there another word that might work better? I'm not excited about using NOT, because I think it has a hint of a double-negative when combined with EXCLUSION. The original idea was to specify the way to find tuples mutually exclusive with the new tuple; and NOT makes that a little less clear, in my opinion. But I'm fine with it if that's what everyone else thinks is best. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Mon, 2009-11-02 at 08:25 +, Peter Eisentraut wrote: > I think the word CHECK should be avoided completely in this syntax, to > avoid confusion with CHECK constraints. This is an easy change. I don't have a strong opinion, so the only thing I can think to do is ask for a vote. Do you have a specific alternative in mind? How about just "WITH"? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Mon, 2009-11-02 at 07:38 +, Simon Riggs wrote: > It bothers me that we would have completely separate syntax for this > feature as opposed to normal SQL. It also doesn't make it easy to > interpret from the business statement to the implementation. Notice that > the "," above means "AND". Yes, in that way, it's similar to a UNIQUE constraint, e.g. UNIQUE (a, b). The more columns you add, the more permissive the constraint. > How would we use an OR conditional? Specify multiple constraints. > How would > we express the wish to use a partial index? EXCLUSION (...) WHERE (...) The perens are actually required around the predicate in this case, due to syntactic problems. > How would I express a bidding rule: "Only allow bids that are better > than the highest bid so far" > > EXCLUSION (item CHECK WITH =, bid_price CHECK WITH <) That would be a cool feature, unfortunately it won't work in the current form. This constraint is only enforced on index insert -- imagine what confusion would be caused when: UPDATE foo SET third_column = 7 ... If that's a HOT update, it wouldn't re-check the bid_price. If it turns into a cold update, it would reject the update because the bid_price is no longer the highest. > Did I get the ">" the right way around? The above problem essentially means we only allow commutative operators, which avoids this source of confusion. Interestingly, reflexive operators aren't required. So, if <> is searchable, you can have the opposite of unique: all values must be the same. That might be interesting for something like: EXCLUSION(room CHECK WITH =, during CHECK WITH &&, student_grade CHECK WITH <>) To ensure that a shared room isn't shared between students of different grades. Not the most compelling use case, but I could imagine something along these lines being useful. Maybe a better example would involve sheep and wolves ;) > How would I specify a tree that has only 2 down branches at any node, > 'left' and 'right'? I'm not sure I understand this exactly. If the left or right is explcitly a part of the tuple, I think it can be done with unique. If not, and you're looking for a maximum of two tuples, you can see my ill-fated extension to this feature here: http://archives.postgresql.org/pgsql-hackers/2009-11/msg00016.php As Tom pointed out, most use cases would not have a constant limit throughout the table. If you do have such a use case, you can revive the proposal. > Not sure that if we submitted this to SQL > Standard committee that it would be accepted as is. There are implementation details bubbling up to the user-visible behavior, and I share your concern. The SQL committee would never care about these implementation details, and I wish we didn't have to, either. The machinism that I've employed searches (using a dirty snapshot) for all of the physical tuples that cause a logical conflict with the physical tuple currently being added. If found, it uses physical information from those tuples, like visibility information, to determine whether to wait, and on whom to wait. After waiting it may either proceed or abort. If we move closer to a nice, clean query to express the constraint, it gets very difficult to tell which physical tuple is responsible for the conflict. If we don't know what physical tuple is causing the conflict, we have to serialize all writes. Additionally, the more it looks like a query, the more we have to tell users "follow this template" -- which will just lead to confusion and disappointment for users who think we've implemented SQL ASSERT (which we haven't). Although the current syntax isn't great, it is declarative, and it does allow a variety of constraints. I certainly welcome ideas that will make a better trade-off here. At the end, I just want a feature that can implement temporal keys. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: operator exclusion constraints with cardinality
On Sun, 2009-11-01 at 23:09 -0500, Robert Haas wrote: > Following that thought, in this particular case it seems like you could do: > > EXCLUSION (room CHECK WITH =, > attendee CHECK WITH =, > foobar CHECK WITH =, > during CHECK WITH &&) > and then also > CHECK (foobar >= 1 AND foobar <= 30) ... > Since you already have attendee as part of the > constraint, I'm a little mystified as to what the quantity of 30 is > supposed to represent, Yes, that was wrong, attendee shouldn't have been in the constraint. > but it any case it seems like you can get the > effect with an extra field That's one way to do it, but then you have to assign numbers to all attendees, which creates a new contention problem. If using discrete time slots aligned to the hour, for instance, you could increment a field in the "room" table every time an attendee was added, and then use a CHECK constraint to limit that field. The fact that updates follow the chain of concurrent updates makes that work nicely. That doesn't seem to work for general overlapping and unaligned time periods, however. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: operator exclusion constraints with cardinality
On Sun, 2009-11-01 at 22:49 -0500, Tom Lane wrote: > you should have no hard-wired upper bound on > how many of them you allow. You're right. I saw something that looked easy to implement, but in practice it wouldn't be very useful. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] proposal: operator exclusion constraints with cardinality
I have completed the open issues (that I'm aware of) for operator exclusion constraints: http://archives.postgresql.org/message-id/1257101600.27737.159.ca...@jdavis I'd now like to propose an extension to that feature: cardinality limits. It could go something like this (syntax still open for discussion, for illustration only): EXCLUSION (room CHECK WITH =, attendee CHECK WITH =, during CHECK WITH &&) CARDINALITY 30 To mean that at most 30 attendees can be signed up for the same room at overlapping times. I have hacked up a basic prototype just to make sure the semantics can work reasonably well. I haven't implemented the syntax or catalog changes yet, but the basic semantics with a hard-coded cardinality seem to hold up. Thoughts? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Sun, 2009-11-01 at 22:42 +, Simon Riggs wrote: > After reading the docs in the patch I don't believe you're going to all > this trouble to ensure two circles don't overlap. Can you give some > better examples of what you're trying to achieve and why anyone else > would care? (I'm busy, so are others). Non-overlapping periods of time. I couldn't document that, because the PERIOD type doesn't exist in core (yet). > I can probably guess, but my feeling is I shouldn't have to. I feel like > this might be a truly great feature, but I'm worried that either it > isn't at all or it is and yet will be overlooked. Does this project link > in with other planned developments in various plugins? Absolutely: http://archives.postgresql.org/pgsql-hackers/2009-10/msg01813.php > The current patch writes the syntax like this > EXCLUSION USING gist (c CHECK WITH &&) > makes it look like a table constraint, yet it clearly refers to a single > column. That looks very clumsy to read, to my eyes. It is a table constraint, and you can specify multiple columns. I don't see much point in allowing this as a column constraint, because that's not the typical case. Most of the time, there will be two columns like: EXCLUSION(room_number CHECK WITH =, during CHECK WITH &&) In other words, usually there is both a resource and a period of time for the reservation. It is of course possible to use it for a column constraint, and I'll add syntax if there's demand for it. > The syntax be easier to read if it was stated as a comparison > e.g. in the circle example > CHECK ( NOT (NEW.c && c)) USING GIST > where NEW is the incoming row. > This is similar to the way I would write the constraint if I wanted to > ensure the values in two columns did not match/overlap etc > CHECK ( NOT (col1 && col2)) > and is also not such a radical departure from existing SQL Standard > syntax. We've already had very extensive discussion about the syntax. Your idea is interesting, but I agree with Tom that it's not ideal, either. NEW might be OK, but Tom's observation about the new meaning of "c" (ranging over the entire table) is a compelling problem. Consider: CHECK ( NOT (NEW.c && c OR c && d)) The right side of the OR could either mean "c overlaps d" or "forall c, d: c overlaps d". I can't come up with a way to treat "c" consistently between the left and right side of the OR (put another way, is "c" free or bound?). We could allow subselects in CHECK, but it's difficult to infer from arbitrary queries what I can enforce with an operator exclusion constraint, and what I can't. If you want to re-open the syntax discussion, we can (right is better than soon). However, it is late in the cycle, so I'll need something very clear quite soon if this is going to make it into 8.5. Personally I think the current syntax is pretty good. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Syntax for partitioning
On Fri, 2009-10-30 at 17:39 -0400, Robert Haas wrote: > IMO, the real problem is that the type interface is poorly > encapsulated. There's way too much code that knows about the internal > details of a type - namely, that it's a 32-bit integer modified by a > second 32-bit integer. I think there are still places where the code > doesn't even know about typmod. If we're going to go to the trouble > of changing anything, I think it should probably involve inserting an > abstraction layer that will make future extensions easier. But I have > a feeling that's going to be a tough sell. Yeah. We're way off topic for partitioning, so I think it's best to just table this discussion until someone comes up with a good idea. It's not the end of the world to write some generic C code, and have multiple types make use of it, e.g. PERIOD, PERIODTZ, INT4RANGE, FLOAT8RANGE, etc. It's a little redundant and creates some catalog bloat, but I'm not too concerned about it right now. Certainly not enough to rewrite the type system. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Syntax for partitioning
On Fri, 2009-10-30 at 19:12 +0200, Heikki Linnakangas wrote: > Wait, it doesn't? A typmod is a 32-bit integer, like Oids. Am I missing > something? Oid is unsigned, typmod is signed. We might be able to get away with it, but -1 is treated specially in some places outside of the type-specific functions, e.g. exprTypmod(). I haven't looked at all of these places yet, so maybe a few simple changes would allow us to treat typmod as a full 32 bits. Or perhaps it could just be expanded to a signed 64-bit int. What do you think? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Syntax for partitioning
On Fri, 2009-10-30 at 10:03 +0200, Peter Eisentraut wrote: > I can't help but wonder if the period type might better be a generic > container for pairs of scalar, totally-ordered types. That would be ideal. However, it doesn't really look like our type system was built to handle that kind of thing. We could use typmod, I suppose, but even that won't hold a full Oid. Any ideas/suggestions? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Syntax for partitioning
On Fri, 2009-10-30 at 00:10 +0200, Peter Eisentraut wrote: > On tor, 2009-10-29 at 11:15 +0900, Itagaki Takahiro wrote: > > Range partitioning: > > CREATE TABLE table_name ( columns ) > > PARTITION BY RANGE ( a_expr ) > > ( > > PARTITION name VALUES LESS THAN [(] const [)], > > PARTITION name VALUES LESS THAN [(] MAXVALUE [)] -- overflow partition > > ); > > Maybe this needs to mention the actual operator name instead of LESS > THAN, in case the operator is not named < or the user wants to use a > different one. I can't help but wonder if the PERIOD type might be better for representing a partition range. It would make it easier to express and enforce the constraint that no two partition ranges overlap ;) Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal - temporal contrib module
On Thu, 2009-10-29 at 10:54 +0200, Heikki Linnakangas wrote: > I presume you're going to need some backend support and possibly new > syntax for some of the operations, right? That seems more urgent to > discuss than the possible inclusion into contrib. There are various areas that need work inside the backend: * Semantics 1. Allow temporal keys -- this is accomplished via operator exclusion constraints, which I hope can be committed in the next commit fest. 2. Allow postgres to understand types like PERIOD, so that it can find important operators like "@>", "&&", "<<". * Syntax Sugar 1. Temporal keys 2. Temporal FKs 3. Temporal Join 4. Creating a simple audit log 5. Possible PERIOD constructor syntax sugar? * Performance 1. Modified merge join And I believe the rest can be done using the existing type infrastructure, i.e. as a contrib module. I think it makes a lot of sense to discuss and develop the backend changes and PERIOD data type in parallel. > Hmm. Infinity feels like a better match. The behavior of length and set > operations falls out of that naturally. For example, length of a period > with an infinite beginning or end is infinite. For set operations, for > example the intersection of [123, infinity] and [100, 160] would be > [123, 160]. I agree. If TSQL-2 addresses NULL semantics clearly enough, we might want to allow it. I think it will just cause confusion, however. > I'd stick to your current definition that a period is a contiguous set > of time. A non-contiguous set consists of multiple contiguous periods, > so it can be represented as multiple rows in a table. I think there is a lot of value in non-contiguous sets, but PERIOD is a good start. > It should also be kept in mind that although this class of problems are > generally thought of as temporal issues, IOW dealing with time, the same > approach works with ranges of integers or any other datatype with a > well-defined sort order. Agreed. > It would be nice if the temporal data type > would allow that too. If I understand what you're saying, you're alluding to a type where you can do things like: RANGE(timestamptz) which would be equivalent to a PERIOD. Typmod almost provides enough flexibility, but it can't store a full OID, so we'd need to get creative. There are probably some other issues here as well, because the current type system isn't really designed for this kind of thing. Do you have any ideas or guidance here? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal - temporal contrib module
On Thu, 2009-10-29 at 09:37 +, Richard Huxton wrote: > There are cases where one time is genuinely unknown, and there we need > a null. The semantics of a period with one side NULL require a more clear definition. I don't personally see a lot of utility in trying to support NULL semantics, but if we want to support it, it needs to be clearly defined. Does TSQL-2 offer any guidance here? > That's assuming we've got a guarantee > that from<=to for all periods. Of course. Except that means that a NULL on one side of a period is a little less unknown than other kinds of NULLs ;) Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal - temporal contrib module
On Thu, 2009-10-29 at 00:31 -0700, Scott Bailey wrote: > Nulls - A common use case for periods is for modeling valid time. Often > the end point is not known. For instance, you know when an employee has > been hired but the termination time typically wouldn't be known ahead of > time. We can either represent these with a null end time or with > infinity. But I'm not sure how to deal with them. Obviously we can test > for containment and overlap. But what about length or set operations? I think we should allow the database designer flexibility here. For instance, there are good arguments for using separate relations for things with a known begin time and no known end, and things with a definite begin and end time. However, infinity also makes sense, particularly in cases where something that is true can never again be false. NULL support is a little stranger. We can allow it by extending the data representation, but the semantics might not match what people expect from NULL. Should it be treated like a NULL in an array, or a NULL in a record value, or what? If we allow NULL on one side of a period, that may require some backend support, depending on the semantics. My feeling right now is to not provide a way for one side of the period to be NULL, but if we come up with clear enough semantics maybe it's useful. I tend to think it would cause more confusion than anything. For any kind of built-in audit log functionality (transaction-time based), I don't see any utility for NULL. > Temporal Keys - We need two types of temporal keys. A primary key, > exclusion type prevents overlap so someone isn't at two places at the > same time. And a foreign key, inclusion type so we can check that the > valid time of a child is contained with in the valid time of the parent. > Jeff is working on the former, but there is no easy way to do the latter. I believe temporal foreign keys can be implemented the same way foreign keys are now (except with "contained by" rather than "equals"). We should provide some support to make that easier, but I don't think that's a major issue. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Wed, 2009-10-28 at 21:09 +0200, Hannu Krosing wrote: > Is at least the fact that they "are undocumented, have changed in the > past, and are likely to change again in the future" documented ? That's a little confusing to me: how do we document that something is undocumented? And where do we stop? > Hashing is a quite fundamental thing in computing, so I was quite > surprised to find out it had silently changed. There are many reasons to use a hash, and we don't want people to use these functions for the wrong purpose. I have seen people use a performance hash for security purposes before, and I had to demonstrate some hash collisions to show why that was a bad idea. So, if we do provide documented functions, it should be done carefully. Trying to develop and document a set of standardized, stable hash functions covering a wide range of possible use cases sounds like it may be better served by an extension. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Scaling up deferred unique checks and the after trigger queue
On Mon, 2009-10-26 at 17:23 +, Dean Rasheed wrote: > If it's of any relevance, I'm currently using an optimised build, with > assert checking off. > [Linux x86_64, 2 core Intel Core2] Ok, I'm able to reproduce it now. Thanks for looking into it! Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "toast.fillfactor" is documented but not recognized?
On Mon, 2009-10-26 at 19:11 -0300, Alvaro Herrera wrote: > We explicitely disallow setting fillfactor on toast tables. So should that be made more clear in the documentation? http://www.postgresql.org/docs/8.4/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS Looking at that page briefly I would assume that it could be set. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Scaling up deferred unique checks and the after trigger queue
On Mon, 2009-10-26 at 13:41 +, Dean Rasheed wrote: > I did a quick bit of testing, and I think that there is a > locking/concurrency problem :-( Unfortunately I can't reproduce the problem on my machine; it always passes. If you have a minute, can you try to determine if the problem can happen with a non-deferrable constraint? I'll keep looking into it. Thanks, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Scaling up deferred unique checks and the after trigger queue
On Mon, 2009-10-19 at 17:48 +0100, Dean Rasheed wrote: > This is a WIP patch to replace the after-trigger queues with TID bitmaps > to prevent them from using excessive amounts of memory. Each round of > trigger executions is a modified bitmap heap scan. Can you please take a look at my patch here: http://archives.postgresql.org/message-id/1256499249.12775.20.ca...@jdavis to make sure that we're not interfering with eachother? I implemented deferred constraint checking in my operator exclusion constraints patch (formerly "generalized index constraints"). After looking very briefly at your approach, I think that it's entirely orthogonal, so I don't expect a problem. I have a git repo here: http://git.postgresql.org/gitweb?p=users/jdavis/postgres.git;a=shortlog;h=refs/heads/operator-exclusion-constraints which may be helpful if you just want to look at the commit for deferred constraint checking. Any comments welcome. I'll also take a look at your patch in the next few days. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pre-proposal: type interfaces
On Fri, 2009-10-23 at 16:25 -0400, Tom Lane wrote: > Forgot to mention: I do not think default-ness of opclasses enters > into it at all. The meaning of the query is fully defined by the > operator that is in it. All we need to know is what are the > semantics of that operator. If we can find it in the "overlaps" > position of *any* opclass, we are entitled to suppose that it > behaves like overlaps and the associated left-of operator can be > used to optimize it. Interesting, that sounds we've got a good approach to the problem now. This thread has been useful. > Conceivably we could get different left-of > operators out of different opclasses, but if they don't behave > effectively the same, the user has messed up the opclasses. It would probably be worthwhile to make an attempt to throw a useful error there, but I agree it's not really a problem. > The case where default-ness of opclasses matters is where we are > trying to assign specific meaning to some generic construct like > DISTINCT or ORDER BY. For instance, it makes sense to require > that ORDER BY be interpreted by reference to a default opclass, > because otherwise you don't have a way to know which sort ordering > the user wants. That makes sense. Thanks, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pre-proposal: type interfaces
I am starting to plan a few features that are important for temporal data, and one prerequisite for several of them is the ability to find an operator that fills a certain role. For instance, one feature that I'm considering now is a "temporal join" which is a join on "overlaps" rather than "equals", e.g.: SELECT * FROM a, b WHERE a.x && b.x; I might try to provide a modified merge join algorithm to implement that more efficiently in some cases. I don't mean to discuss the feature in detail here (I will make a separate proposal) but the algorithm requires that I find the "strictly left of" operator. So, after I recognize that a temporal join is required, I need to be able to identify the << operator. But how? In other languages, it would probably be done with something like an "interface", but we don't have that concept here. The internals generally use operators attached to the default btree opclass, but I don't think that works very well here. The way I see it, we have two approaches: 1. Try to make the current system work by standardizing the strategy numbers for GiST somehow, and then use the default GiST operator class, if available. 2. Invent a new system, perhaps interfaces, perhaps something else. 3. Use extra flags in CREATE OPERATOR somehow Thoughts? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] B-tree leaf node structure
On Wed, 2009-10-21 at 23:55 -0700, edwardyf wrote: > If the index is on an attribute with duplicate values. will it be: > 1) one index tuple for each row, though with the same value, or > 2) one index tuple for each value, containing a list of row ids. As Tom already pointed out, #1 is the answer. However, I'd like to add that there's a feature that never quite made it called Grouped Index Tuples (GIT) that might still be viable: http://community.enterprisedb.com/git/ Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Going, going, GUCs!
On Tue, 2009-10-20 at 10:49 -0700, David Fetter wrote: > synchronize_seqscans (should be on) Right now this is used for pg_dump, because pg_dump could un-cluster a previously clustered table (I believe Greg Stark made this observation). This is actually a stats/planner issue more than anything else, because the table isn't _really_ unclustered, but it is still an issue (seems minor to me, but not insignificant). Also, it seems reasonable that testers might want to use something like this, if they don't want to ORDER BY. For instance, if testing postgresql itself, ORDER BY would change what you're testing. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] per table random-page-cost?
On Mon, 2009-10-19 at 21:22 -0500, Kevin Grittner wrote: > I'd bet accounts receivable applications often hit that. > (Most payments on recent billings; a sprinkling on older ones.) > I'm sure there are others. You worded the examples in terms of writes (I think), and we're talking about read caching, so I still don't entirely understand. Also, the example sounds like you'd like to optimize across queries. There's no mechanism for the planner to remember some query executed a while ago, and match it up to some new query that it's trying to plan. Maybe there should be, but that's an entirely different feature. I'm not clear on the scenario that we're trying to improve. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] per table random-page-cost?
On Mon, 2009-10-19 at 16:39 -0700, Greg Stark wrote: > But the long-term strategy here I think is to actually have some way > to measure the real cache hit rate on a per-table basis. Whether it's > by timing i/o operations, programmatic access to dtrace, or some other > kind of os interface, if we could know the real cache hit rate it > would be very helpful. Maybe it would be simpler to just get the high-order bit: is this table likely to be completely in cache (shared buffers or os buffer cache), or not? The lower cache hit ratios are uninteresting: the performance difference between 1% and 50% is only a factor of two. The higher cache hit ratios that are lower than "almost 100%" seem unlikely: what kind of scenario would involve a stable 90% cache hit ratio for a table? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] contrib/plantuner - enable PostgreSQL planner hints
On Fri, 2009-10-16 at 12:04 -0500, decibel wrote: > I'm guessing you couldn't actually do that in just a contrib module, > but it's how Oracle handles hints, and it seems to be *much* more > convenient, because a hint only applies for a specific query. If that's the only reason, that seems easy enough to solve by using SET right before the query. SET LOCAL might be convenient if you want to forget the setting after the query. Connection pool software will do a RESET ALL anyway. There are reasons that it might be convenient to use hints inside the query itself -- for instance, if you want something to apply only to a subquery. I'm still hoping that someone will come up with a more elegant solution to solve that problem though. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sun, 2009-09-20 at 17:54 -0400, Tom Lane wrote: > (I'm wondering a bit if anyone will want a WHERE clause, too, though > adding that later shouldn't pose any big syntactic obstacles.) Where should I put the WHERE clause? My current syntax (with patch) is: [ CONSTRAINT constraint_name ] EXCLUSION [USING index_method] (expression CHECK WITH operator [, ...]) index_parameters } [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] [ WHERE predicate ] That's a little awkward to document, because WHERE is only supported for operator exclusion constraints, so it doesn't just fit along side CHECK and FOREIGN KEY. My only concern is that it would make the CREATE TABLE syntax slightly harder to read. We could allow the WHERE clause to be syntactically correct for all the other constraints, but throw a "not implemented" error if they try to use it. I'm not sure if that fits nicely with the spec or not. I tried to move the WHERE clause right before or after the index_parameters, but that resulted in shift/reduce conflicts. Thoughts? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] alpha 2 release notes
On Wed, 2009-10-14 at 17:35 +0300, Peter Eisentraut wrote: > In general, do we want the alpha 2 release notes as a separate section > above alpha 1, or merged with the alpha 1 notes thereby giving a > cumulative view of what happened since 8.4? Cumulative makes sense to me. Even if people are trying out all of the alpha releases, they might have forgotten to try some of the features from alpha1. That being said, it's probably a good idea to highlight or mark the features new to alpha2 somehow, perhaps with a "*". Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [GENERAL] contrib/plantuner - enable PostgreSQL planner hints
On Tue, 2009-10-13 at 09:14 -0500, Kevin Grittner wrote: > Now that we can generate EXPLAIN output in more structured formats, > perhaps we could think about adding an "extremely verbose" mode where > the planner would "think out loud" as a whole separate section from > where we show the chosen plan? Tom Raney did that a while back: http://archives.postgresql.org/pgsql-patches/2008-07/msg00011.php He also had an accompanying visual tool to navigate the output in a meaningful way. If he has moved on to other projects, it would be great if someone could pick it up. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] missing entry in GucSource_Names
It appears that the patch here: http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=a30fa4ca33d055c46bebc0e5c701d5b4fd27814d missed adding PGC_S_DATABASE_USER to a few locations, most notably GucSource_Names, where the PGC_S_SESSION now points off the end of the array. Patch attached. Regards, Jeff Davis diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 1f63e06..776efe3 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -460,6 +460,7 @@ const char *const GucSource_Names[] = /* PGC_S_ARGV */ "command line", /* PGC_S_DATABASE */ "database", /* PGC_S_USER */ "user", + /* PGC_S_DATABASE_USER */ "database user", /* PGC_S_CLIENT */ "client", /* PGC_S_OVERRIDE */ "override", /* PGC_S_INTERACTIVE */ "interactive", @@ -4556,7 +4557,8 @@ set_config_option(const char *name, const char *value, */ elevel = IsUnderPostmaster ? DEBUG3 : LOG; } - else if (source == PGC_S_DATABASE || source == PGC_S_USER) + else if (source == PGC_S_DATABASE || source == PGC_S_USER || + source == PGC_S_DATABASE_USER) elevel = WARNING; else elevel = ERROR; @@ -5762,6 +5764,7 @@ define_custom_variable(struct config_generic * variable) break; case PGC_S_DATABASE: case PGC_S_USER: + case PGC_S_DATABASE_USER: case PGC_S_CLIENT: case PGC_S_SESSION: default: -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] transaction_isolation vs. default_transaction_isolation
On Mon, 2009-10-12 at 22:13 -0700, Josh Berkus wrote: > However, for *two* settings, and two settings only, we distinguish that > by naming an identical setting "default_*" in postgresql.conf. This is > confusing and inconsistent with the rest of the GUCS. Namely: > > default_transaction_isolation > default_transaction_read_only I think they are named "default_" because whatever you specify at the beginning of a transaction overrides the GUC. For example, in: BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; SET default_transaction_isolation=serializable; ... the "default_" makes it more clear which setting overrides the other. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writeable CTEs and side effects
On Fri, 2009-10-09 at 02:23 +0300, Peter Eisentraut wrote: > INSERT INTO tab1 SELECT ... FROM tab1 > > clearly requires the SELECT to be distinctly before the INSERT. That's effectively only one thing: assigning a relation (the result of the select) to a variable (tab1). I was talking about multiple assignment. What if you want to append foo to bar and bar to foo? WITH t1 AS (INSERT INTO foo SELECT * FROM bar), t2 AS (INSERT INTO bar SELECT * FROM foo) VALUES(1); That could be an interesting command if we didn't increment the command counter. > SELECT * FROM test1 WHERE a IN (UPDATE test2 SET b = b + 1 RETURNING b); > > I think I'd want "writable subqueries" instead of only "writable CTEs". I think the original motivation was that it's more clear that a CTE is separated and can only be executed once (if it has side effects). Depending on how the query is written, it might be less obvious how many times the subquery should be executed, and it might change based on the plan. We could make the same rules for a subquery that has side effects, and always materialize it. But for now maybe CTEs are a better place to get the feature working. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writeable CTEs and side effects
On Thu, 2009-10-08 at 15:11 -0400, Tom Lane wrote: > > I'm not sure if this is a problem, but it seems like we're essentially > > allowing a complex transaction to take place in one statement. Is that > > what we want? > > Yeah, I think that's more or less the point ... I'm still trying to ponder the consequences of this. Most people assume that a single statement means that everything in the statement happens at once (intuitively). The few cases where that's not true are special commands or things that we are trying to fix, like: "UPDATE foo SET a = a + 1". I get the feeling that we're turning a declarative statement into something more procedural. I suppose one difference between this and a BEGIN ... END block would be that the isolation from other transactions would always be SERIALIZABLE. I can't clearly articulate a problem with any of these things, but it does seem vaguely troubling. Also, are we missing out on an opportunity to provide some interesting functionality if we do treat two DML statements as happening simultaneously? I've read some interesting perspectives on this in the past, and it's not trivial, but we might want to leave the possibility open. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writeable CTEs and side effects
On Thu, 2009-10-08 at 12:57 -0400, Tom Lane wrote: > I also agree with bumping the CID in between. Do you mean bump the CID in between each DML statement, or between the last DML statement and the main query? If the former, how should we choose the order of execution? I'm not sure if this is a problem, but it seems like we're essentially allowing a complex transaction to take place in one statement. Is that what we want? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Thu, 2009-10-08 at 09:44 -0700, David E. Wheeler wrote: > +1 Thanks for getting this done. > > Now, does this just apply to PL/pgSQL? If so, what needs to happen for > other PLs to support the feature? It's just the call notation -- the function only needs to know what arguments it got for which parameters. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Wed, 2009-10-07 at 18:17 -0400, Tom Lane wrote: > No, that's not what I'm driving at. The small change that I've got in > mind would require you to say VARIADIC, but it would allow the function > to match both the above call and > foo(a AS x, c AS z, VARIADIC b AS y) > when really z is the variadic parameter in this case. I'm not sure if > this would bother anyone or not. It seems impossible that a function > could ever have more than one variadic parameter, so there's not really > any ambiguity from maintaining the syntactic rule that the VARIADIC > keyword is at the end even when the variadic argument isn't, but it > might look a bit odd. I'm worried about allowing such strange notation. Someone might have a new idea later that conflicts with it, and then we have a backwards-compatibility problem. > What I *don't* want to do is fix this by allowing/requiring > foo(a AS x, VARIADIC c AS z, b AS y) > because it would be a bigger change in the grammar output structure than > seems warranted. If it's the "right" thing to do (or might be the right thing to do), someone will want to do that later, and that would be incompatible with the: foo(a AS x, c AS z, VARIADIC b AS y) notation (where z is the variadic parameter). > We could possibly have VARIADIC throw an error if the > named argument that matches to the variadic parameter isn't the last > one, but I'm not sure that that's important rather than just pedantry. I would prefer such a restriction if it's reasonable to do. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Wed, 2009-10-07 at 17:45 -0400, Tom Lane wrote: > I think what he was considering was the question of insisting that > the VARIADIC keyword be attached to the named argument that actually > matches the VARIADIC parameter. I think we could do it, but it might > be a bit of a wart. I notice that right now, an unnecessary VARIADIC > keyword in a regular positional call does not cause an error, it's just > ignored --- so we're already being a bit lax with it. >From a semantic standpoint, I lean towards requiring the VARIADIC keyword consistently between named and positional notation. It seems strange to me if we have a situation where changing the call: foo(a, b, VARIADIC c) to be more explicit by using named call notation: foo(a AS x, b AS y, VARIADIC c AS z) is "less correct" in the sense that the VARIADIC keyword goes from "required" to "ignored". Also, requiring VARIADIC seems to guard us better against future changes, which seemed like a concern before. I don't have a strong opinion or a specific problem with making VARIADIC optional, so it's OK with me. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Wed, 2009-10-07 at 23:32 +0200, Pavel Stehule wrote: > It's same as my origin ideas, much better formulated. It is ok for me. You indicated that there may be some implementation difficulty if the VARIADIC keyword is required for calling using named notation: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01792.php Do you think it would be reasonable to implement? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Wed, 2009-10-07 at 16:58 -0400, Tom Lane wrote: > * completely ignores variadic functions when trying to match > a call having any named arguments > > * does not throw an error for use of the VARIADIC keyword > in a call together with named arguments > > Neither of these behaviors quite seem to me to satisfy the principle of > least astonishment, and in combination they definitely do not. I agree that the combination is wrong, and we should either allow that call notation or throw a useful error message when it's attempted. > It seems to me that there is not anything wrong with using named > arguments together with VARIADIC and getting a match to a variadic > function. The general feeling was that we should only support the most obvious call notations so we wouldn't have backwards compatibility problems if we tried to change it later. If we allow calling a variadic function using named notation, the VARIADIC keyword is not strictly necessary, but I think we should require it. It seems strange without it. Pavel indicated that there may be some implementation difficulty in requiring the VARIADIC keyword when calling a variadic function using named notation: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01792.php and that just kind of pushed the idea from "maybe that's OK" to "probably not a good idea right now". Robert Haas weighed in here: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01732.php Its fine with me to allow it, assuming there's a reasonable way to implement it, and assuming that the VARIADIC keyword is required. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY enhancements
On Fri, 2009-09-25 at 10:01 -0400, Emmanuel Cecchet wrote: > Robert, > > Here is the new version of the patch that applies to CVS HEAD as of this > morning. I just started looking at this now. It seems to fail "make check", diffs attached. I haven't looked into the cause of the failure yet. Regards, Jeff Davis *** /home/jdavis/wd/git/postgresql/src/test/regress/expected/copy_errorlogging.out 2009-10-04 10:24:15.0 -0700 --- /home/jdavis/wd/git/postgresql/src/test/regress/results/copy_errorlogging.out 2009-10-04 10:24:32.0 -0700 *** *** 46,58 -- both \n and \r\n present COPY foo FROM '/home/jdavis/wd/git/postgresql/src/test/regress/data/foo_malformed_terminator.data'; - ERROR: literal carriage return found in data - HINT: Use "\r" to represent carriage return. - CONTEXT: COPY foo, line 2: "" SELECT count(*) from foo; count --- ! 0 (1 row) -- create error logging table --- 46,55 -- both \n and \r\n present COPY foo FROM '/home/jdavis/wd/git/postgresql/src/test/regress/data/foo_malformed_terminator.data'; SELECT count(*) from foo; count --- ! 5 (1 row) -- create error logging table *** *** 75,81 SELECT count(*) from foo; count --- ! 4 (1 row) DELETE FROM foo; --- 72,78 SELECT count(*) from foo; count --- ! 9 (1 row) DELETE FROM foo; *** *** 101,113 DELETE FROM foo; -- error logging skipping tuples: both \n and \r\n present COPY foo FROM '/home/jdavis/wd/git/postgresql/src/test/regress/data/foo_malformed_terminator.data' (ERROR_LOGGING, ERROR_LOGGING_SKIP_BAD_ROWS); - ERROR: literal carriage return found in data - HINT: Use "\r" to represent carriage return. - CONTEXT: COPY foo, line 2: "" SELECT count(*) from foo; count --- ! 0 (1 row) DELETE FROM foo; --- 98,107 DELETE FROM foo; -- error logging skipping tuples: both \n and \r\n present COPY foo FROM '/home/jdavis/wd/git/postgresql/src/test/regress/data/foo_malformed_terminator.data' (ERROR_LOGGING, ERROR_LOGGING_SKIP_BAD_ROWS); SELECT count(*) from foo; count --- ! 5 (1 row) DELETE FROM foo; == -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Fri, 2009-10-02 at 16:06 +0200, Pavel Stehule wrote: > see attachment, please Thank you, marked as "ready for committer". Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Thu, 2009-10-01 at 17:56 +1000, Brendan Jurd wrote: > I've had a look through the documentation and cleaned up a few things. Thanks, that is helpful. I have included that along with some other comment updates in the attached patch. Paval, In ProcedureCreate(), you match the argument names to see if anything changed (in which case you raise an error). The code for that looks more complex than I expected because it keeps track of the two argument lists using different array indexes (i and j). Is there a reason it you can't just iterate through with something like: if (p_oldargmodes[i] == PROARGMODE_OUT || p_oldargmodes[i] == PROARGMODE_TABLE) continue; if (strcmp(p_oldargnames[i], p_names[i]) != 0) elog(ERROR, ... I'm oversimplifying, of course, but I don't understand why you need both i and j. Also, there is some unnecessarily verbose code like: if (p_modes == NULL || (p_modes != NULL ... In func_get_detail(), there is: /* shouldn't happen, FuncnameGetCandidates messed up */ if (best_candidate->ndargs > pform->pronargdefaults) elog(ERROR, "not enough default arguments"); Why is it only an error if ndargs is greater? Shouldn't they be equal? /* * Actually only first nargs field of best_candidate->args is valid, * Now, we have to refresh last ndargs items. */ Can you explain the above comment? Please review Brendan and my patches and combine them with your patch. Regards, Jeff Davis diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index 50c4128..542646d 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -1505,6 +1505,10 @@ sqrt(2) The list of built-in functions is in . Other functions can be added by the user. + + + See for more details. + diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index 9e8ccfa..1c06885 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -651,21 +651,19 @@ FuncnameGetCandidates(List *names, int nargs, List *argnames, int effective_nargs; int pathpos = 0; bool variadic; - bool use_defaults = false; /* be compiler quiet */ + bool use_defaults = false; /* make compiler quiet */ Oid va_elem_type; FuncCandidateList newResult; int *proargidxs = NULL; - /* Try to attach names, when mixed or named notation is used. */ + /* Try to attach names when mixed or named notation is used. */ if (argnames != NIL) { /* - * Mixed or named notation + * Mixed or named notation * - * We would to disable an call of variadic function with named - * or mixed notation, because it could be messy for users. We - * would to allow only unique arg names, and this is useles for - * variadic functions. + * Variadic functions can't be called using named or mixed + * notation. */ if (OidIsValid(procform->provariadic)) continue; @@ -760,9 +758,12 @@ FuncnameGetCandidates(List *names, int nargs, List *argnames, newResult->nargs = effective_nargs; /* - * Wait with apply proargidxs on args. Detection ambigouos needs - * consistent args (based on proargs). Store proargidxs for later - * use. + * Save proargidxs in newResult. It's needed later to adjust + * the argument types to be the types corresponding to the + * named arguments (if any), and also to assign positions to + * any NamedArgExpr arguments after the best candidate is + * determined. The former could be done here, but we leave + * both for the caller to do. */ newResult->proargidxs = proargidxs; memcpy(newResult->args, procform->proargtypes.values, @@ -980,8 +981,9 @@ VerifyCandidateNamedNotation(HeapTuple proctup, int pronargs, int nargs, List *a Assert(p_argnames != NULL); /* - * A number less or equal nargs means explicit arguments, - */ + * pronargs equal to nargs means explicit arguments (no defaults) + */ + *proargidxs = palloc(nargs * sizeof(int)); for (i = 0; i < pronargs; i++) argfilling[i] = false; @@ -1004,7 +1006,7 @@ VerifyCandidateNamedNotation(HeapTuple proctup, int pronargs, int nargs, List *a continue; if (p_argnames[i] && strcmp(p_argnames[i], argname) == 0) { - /* protect us against duplicated entries from bad written mixed notation */ + /* protect us against duplicated entries from badly written mixed notation */ if (argfilling[pp]) return false; @@ -1035,9 +1037,13 @@ VerifyCandidateNamedNotation(HeapTuple proctup, int pronargs, int nargs, List *a ap++; } + /* + * This function is only called for named and mixed notation, and + * the last argument must be named in either case. + */ Assert(notation == CALL_NOTATION_NAMED); - /* Check for de
Re: [HACKERS] Rejecting weak passwords
On Mon, 2009-09-28 at 15:52 -0700, Josh Berkus wrote: > > It takes about 32 hours to brute force all passwords from [a-zA-Z0-9] > > of up to 8 chars in length. > > That would be a reason to limit the number of failed connection attempts > from a single source, then, rather than a reason to change the hash > function. That doesn't solve the problem of an administrator brute-forcing your password. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Mon, 2009-09-28 at 19:26 +0200, Pavel Stehule wrote: > > Also, you should consistently pass NIL when you mean an empty list, but > > sometimes you pass NULL to FuncnameGetCandidates(). > > It's bug, where is it? In regproc.c. Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Mon, 2009-09-28 at 18:23 +0200, Pavel Stehule wrote: > when I though about control, I found so syntax with mandatory VARIADIC > is difficult implementable. So probably the most feasible solution for > this moment is to discard a variadic functions from set of functions > that are callable with named notation. So I thing we are in tune, and > I am going to update patch. Sounds good. I am looking at the code, and there's a part I don't understand: In FuncnameGetCandidates(): /* * Wait with apply proargidxs on args. Detection ambigouos needs * consistent args (based on proargs). Store proargidxs for later * use. */ newResult->proargidxs = proargidxs; But after calling FuncnameGetCandidates (the only place where fargnames is non-NIL), you immediately re-assign to best_candidate->args. What happens between those two places, and why can't it happen in FuncnameGetCandidates? Also, you should consistently pass NIL when you mean an empty list, but sometimes you pass NULL to FuncnameGetCandidates(). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Mon, 2009-09-28 at 11:50 +0200, Pavel Stehule wrote: > This is maybe too strict. I thing, so safe version is allow variadic > packed parameter with VARIADIC keyword as Jeff proposes. The combination of variadic parameters and named call notation is somewhat strange, on second thought. Can you identify a use case? If not, then it should probably be blocked in this version of the patch. Even if it makes sense from a syntax standpoint, it might be confusing to users. Robert, did you have a specific concern in mind? Do you see a behavior there that we might want to change in the future? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Sun, 2009-09-27 at 22:40 -0400, Robert Haas wrote: > Apparently, CommitFest > no longer means a time when people put aside their own patches to > review those of others; it seems now to mean a time when 87% of the > patch authors either continue development or ignore the CommitFest > completely. Well, I'm not completely innocent here, either. I also spent time making progress on my patch, both in terms of code and discussion so that I would at least have enough information to get it ready during the next development cycle. We don't have a clear "design review" stage that allows developers an opportunity to get thorough feedback during the development cycle, so the commitfest is also the design review stage by default. I got some comments when I posted my design on 8/16, but it didn't really get hashed out until a month later when the commitfest was underway. The ideal is to propose, design, implement, and then submit for the commitfest. The reality is that the commitfest is often the first time the developer gets thorough feedback on the design. So, as a developer, I'm hesitant to polish a patch before the commitfest because I know significant changes will be required. Hopefully that didn't waste too much of Brendan's time. That's just an observation from my experience with my patch. I know it's easy to point at little inefficiencies from afar, so I'm not suggesting we change our process. Overall, real progress is being made for the project in general and my patch in particular, so I'm more than willing to set my minor frustrations aside as long as that continues. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
On Sun, 2009-09-27 at 21:38 -0400, Robert Haas wrote: > In that case, I think we should target this for the next CommitFest. > Especially given the number and complexity of the patches remaining > for this CommitFest, I feel very uncomfortable with the idea of > waiting another week for a new patch version, and then possibly still > needing further changes before it is finally committed. While we > allow patches to be resubmitted for the same CommitFest, this is > intended to be for minor adjustments, not significant rewrites. OK, I expected that to be the case. I got significant feedback at the beginning of this commitfest that required some substantial language changes. I did find this commitfest extremely productive for my feature. Right now I'm trying to provide some useful feedback to Paval for his patch. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issues for named/mixed function notation patch
On Tue, 2009-09-15 at 10:51 +0200, Pavel Stehule wrote: > My renonc, please, try new patch. I forgot mark regproc.c file. I think the documentation around calling functions is disorganized: Variadic functions, functions with defaults, SRFs, out parameters, and polymorphism are all explained in 34.4, which is about SQL functions specifically. Overloading is in chapter 34 also, but not specifically in the SQL function section like the rest. Function calls themselves are only given 5 lines of explanation in 4.2.6, with no mention of things like the VARIADIC keyword. These complaints aren't about the patch, but we might want to consider some reorganization of those sections (probably a separate doc patch). The interaction with variadic functions appears to be misdocumented. >From the code and tests, the VARIADIC keyword appears to be optional when using named notation, but required when using positional notation. But the documentation says: "However, a named variadic argument can only be called the way shown in the example above. The VARIADIC keyword must not be specified and a variadic notation of all arguments is not supported. To use variadic argument lists you must use positional notation instead." What is the intended behavior? I think we should always require VARIADIC to be specified regardless of using named notation. I'm still reviewing the code. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Wed, 2009-09-23 at 15:10 +0300, Peter Eisentraut wrote: > Using CHECK as part of the syntax of an EXCLUSION constraint will surely > confuse the whole thing with CHECK constraints. > > USING OPERATOR is available, I think. USING won't work because one of the ways to specify the opclass in an index_elem is something like: CREATE INDEX foo_idx on foo (i USING int4_ops); which appears to be undocumented, and it's not obvious to me why that is useful. The normal way is just: CREATE INDEX foo_idx on foo (i int4_ops); Because I am allowing any index_elem for exclusion constraints, that conflicts with the word USING. We can either eliminate the USING variant from opt_class (unless it's necessary for some reason or I missed it in the documentation), or we can use another word (e.g. WITH or WITH OPERATOR) if you don't like CHECK. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Join optimization for inheritance tables
On Tue, 2009-09-22 at 18:16 -0400, Emmanuel Cecchet wrote: > If the partitioning implementation does not make progress (and does not > make it for 8.5), don't you think that this could be an interesting > contribution to consider for this release? > I have put on the wiki > (http://wiki.postgresql.org/wiki/Join_optimization_for_inheritance_tables) > the results obtained so far and the use case where it is most used. I think you mean that the planning time is in milliseconds, not seconds. Also, you use "data" in a few places you meant "date". The results seem good, and trading planning time for execution time seems like a reasonable idea in the case of partitioned tables. We already work harder planning when constraint_exclusion='partition', so there is some precedent (I don't know if that's a good precedent to follow or not). How does it compare to using merge-append? I haven't looked at the actual patch yet. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sun, 2009-09-20 at 19:42 -0400, Tom Lane wrote: > BTW, are you sure EXCLUSION doesn't have to become a reserved word for > this? I notice that FOREIGN, CHECK, and UNIQUE all are, which makes me > suspicious ... All of those (except FOREIGN) can be used as a column constraint as well, and that might be necessary for a reason similar to the reason I need to use a reserved word (i.e. they can come after a DEFAULT expression). Is it possible that FOREIGN doesn't really have to be a reserved word, but was just included because the others were? I'm not an expert on the matter, but it does appear to compile and recognize the grammar with EXCLUSION as an unreserved keyword. I'm in the middle of changing a lot of things around, so I can't say that it works beyond that. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sun, 2009-09-20 at 17:54 -0400, Tom Lane wrote: > Jeff Davis writes: > > 1. Constraint syntax, part of CREATE/ALTER TABLE: > > > [CONSTRAINT ] EXCLUSION ( OPERATOR , ...) > > Have you actually built this grammar? I don't think it avoids the > problem, because OPERATOR is possible within a_expr. > > Also, don't forget the possibility of wanting a nondefault opclass. > (I'm wondering a bit if anyone will want a WHERE clause, too, though > adding that later shouldn't pose any big syntactic obstacles.) I suppose I should just allow any index_elem. The only way I was able to make the grammar for that work is by using a reserved keyword. The possibilities that make the most sense to me are: index_elem WITH any_operator index_elem WITH OPERATOR any_operator index_elem CHECK any_operator index_elem CHECK OPERATOR any_operator Do any of these look acceptable? Also, I should allow for a tablespace, as well. Because it's specified with UNIQUE as "USING INDEX TABLESPACE foo", to be consistent I need to move the "USING method" ahead like so: CONSTRAINT EXCLUSION [USING method] ( CHECK , ...) [USING INDEX TABLESPACE ] [DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] Having the method before the attribute list makes it more consistent with CREATE INDEX, as well. > That's really a separate issue, but I think we need to do something to > make it more consistent. My first thought is that anything made > via CONSTRAINT syntax ought to be copied by LIKE INCLUDING CONSTRAINTS, > while LIKE INCLUDING INDEXES should copy anything you made via CREATE > INDEX. Works for me. > But note this assumes that there is a clear distinction between > the two. The constraint-depending-on-index design that you started > with would not permit such a rule, or at least it would mean that > INCLUDING CONSTRAINTS EXCLUDING INDEXES would have failure cases. Sounds reasonable. If we decide to support that kind of thing in the future, we can handle that case somehow (an error seems reasonable to me). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] operator exclusion constraints [was: generalized index constraints]
Update on operator exclusion constraints (OXC for short): After a lot of discussion, I think a lot of progress has been made. Here is my summary, please let me know if I've left anything out or not addressed some concern. 1. Constraint syntax, part of CREATE/ALTER TABLE: [CONSTRAINT ] EXCLUSION ( OPERATOR , ...) USING [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]; Table constraint syntax was chosen because of the ability to support DEFERRABLE[1] and the interest in a more declarative syntax[2]. We omitted the [INDEX ] clause because the usefulness of defining multiple constraints using one index or defining the constraint separately from the index was judged to be too marginal[3][4][5]. Some brief benchmarks showed some promise[6], perhaps interesting to explore later. Also, we introduce the OPERATOR keyword in between the expression and the operator to disambiguate the syntax[5]. Nobody has affirmed the use of OPERATOR for the disambiguation, but it seems like the obvious choice to me. 2. information_schema We omit operator exclusion constraints from the information schema, on the grounds that there is no way to represent them usefully there[7][8]. 3. Simplify the constraint checking procedure itself Tom suggested a simpler constraint-checking procedure[9]. It introduces the rare possibility of deadlocks[10], but that possibility exists for other constraints anyway[11]. My scheme for avoiding deadlocks was significantly more complex, and would become even more complex for deferrable constraints. 4. is an expression over the table's attributes and will be used to generate a functional index with the same expression to enforce the constraint. 5. We reject non-symmetric operators[12], like >, but allow non-reflexive operators[13] like <>. 6. Semantics of constraint[14] are such that for any two tuples A and B, and for a constraint: EXCLUSION (e1 OPERATOR , ..., eN OPERATOR ) the constraint is violated if: A.e1 B.e1 AND ... AND A.eN B.eN 7. LIKE is still unresolved. I don't have a strong opinion here. When INCLUDING CONSTRAINTS and INCLUDING INDEXES are both specified: a. copy all OXCs and indexes b. copy no OXCs or indexes When INCLUDING CONSTRAINTS is specified but not INCLUDING INDEXES: a. copy all OXCs and indexes b. copy no OXCs or indexes When INCLUDING INDEXES is specified but not INCLUDING CONSTRAINTS: a. copy all OXCs, including indexes b. copy all indexes created implicitly for OXCs, but not the constraints themselves c. copy no OXCs or indexes We can also emit various types of messages if we think the user is making a mistake. UNIQUE behavior here doesn't provide a good cue, because the constraint is implemented inside the index, so copying the index does copy the constraint. Brendan made a strong argument[15] that the behavior of LIKE with UNIQUE is wrong, but I don't know if we want to try to fix that now. I'd like some more input before I actually take care of this item. The rest of the issues were mostly non-controversial. I will start making some of these changes and post an updated patch and TODO list. Regards, Jeff Davis [1] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01352.php [2] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01018.php [3] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01348.php [4] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01355.php [5] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01360.php [6] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01369.php [7] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01310.php [8] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01356.php [9] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01315.php [10] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01317.php [11] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01347.php [12] http://archives.postgresql.org/pgsql-hackers/2009-09/msg00977.php [13] http://archives.postgresql.org/pgsql-hackers/2009-09/msg01039.php [14] http://archives.postgresql.org/pgsql-hackers/2009-09/msg00971.php [15] http://archives.postgresql.org/pgsql-hackers/2009-09/msg00755.php -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sun, 2009-09-20 at 13:49 -0400, Tom Lane wrote: > I'd vote for only supporting the former. Ok. I just did some brief non-scientific in-memory benchmarks. I think it has promise, but for now I think it can safely be set aside. Results appended. > What worries me more about that syntax is the postfix-operator ambiguity > --- I think it'll be hard to expand it to expressions. It might be > better to put the operator at the front; or maybe you need an extra > keyword in there. How about "OPERATOR", like: CONSTRAINT EXCLUSION ( OPERATOR , ...) USING ; I like it because it connects back to the name "operator exclusion constraint". Regards, Jeff Davis --- Results (oversimplified benchmark): As a control, two unique btrees (using old uniqueness mechanism) takes 37s. DDL (old syntax, haven't changed it yet): create table one(a int, b int, c int); create index one_a_b_c_idx on one(a,b,c); alter table one add constraint one_a_b_constr exclusion (a =, b =) using one_a_b_c_idx; alter table one add constraint one_a_c_constr exclusion (a =, c =) index one_a_b_c_idx; create table two(a int, b int, c int); create index two_a_b_idx on two(a,b); create index two_a_c_idx on two(a,c); alter table two add constraint two_a_c_constr exclusion (a =, c =) index two_a_c_idx; alter table two add constraint two_a_b_constr exclusion (a =, b =) index two_a_b_idx; Tests are of the form: -- test inserting into table with one big index with 10 "b" -- values per "a" value insert into one select g1, g2, g2 from generate_series(1,10) g1, generate_series(1,10) g2; n: number of "a" values per "b" value t1: results for one-index solution t2: results for two-index solution n t1 t2 ---+--+--- 1000 | 105s | 57s 100 | 47s | 54s 10 | 44s | 53s 1 | 42s | 56s So, the one-index solution shows about 10-20% benefit over the two-index solution when the number of "b" values per "a" value drops to around 100. Not bad, but nothing to write home about, because it's still outperformed by the existing btree enforcement mechinism. I think it has promise for some situations though; such as larger key size, leaf pages not in memory, etc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sun, 2009-09-20 at 13:28 -0400, Tom Lane wrote: > What I'm arguing for is a syntax in which the question doesn't even > arise, ie, a CONSTRAINT doesn't reference an existing index at all. > If that's not possible for whatever reason, then I think that > disallowing multiple references isn't going to buy any simplicity. I believe that syntax is possible by specifying the index access method, e.g.: CONSTRAINT EXCLUSION (a =, b &&) USING gist; versus: CONSTRAINT EXCLUSION (a =, b &&) INDEX ; And the former could build the index implicitly. I haven't written the code yet, but I don't see any major problems. So, should I eliminate the latter syntax and only support the former, or should I support both? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sun, 2009-09-20 at 13:13 -0400, Tom Lane wrote: > You're right, it still seems remarkably marginal. I'm rethinking > my position on use of CONSTRAINT syntax because of the deferrability > issue, but I'm still unconvinced that we need to allow the constraints > to be decoupled from the indexes. Ok, should I explicitly disallow multiple constraints on one index then? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Sun, 2009-09-20 at 13:01 -0400, Tom Lane wrote: > The current infrastructure for deferred uniqueness requires that the > thing actually be a constraint, with an entry in pg_constraint that > can carry the deferrability options. So unless we want to rethink > that, this might be a sufficient reason to override my arguments > about not wanting to use CONSTRAINT syntax. Ok. Using the word EXCLUSION would hopefully guard us against future changes to SQL, but you know more about the subtle dangers of language changes than I do. So, do I still omit it from information_schema? > As far as implementation goes, I think there would be very little > choice but to use the insert-the-index-entry-first, check-later > approach; so your ideas involving extra state in shared memory > seem to fall to the ground anyhow. True. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sun, 2009-09-20 at 12:45 -0400, Tom Lane wrote: > How often have we had requests to add or drop UNIQUE in an existing > index? Maybe there were more than zero, but not by a lot. Ok. > As an example of why I don't believe the first item, consider something > like > create index ... (a = , b = ) > (or whatever the syntax is to exclude equality on each column > separately). Yeah, it will work, but have you considered the efficiency > implications? Searching such an index for b, independently of a, is > going to suck to such an extent that you'd be *far* better off building > two separate indexes. We do not have, and are unlikely ever to have, > index types in which a search that doesn't constrain the first index > column is efficient. My use case was something else: An index on (a, b, c) enforcing the constraints UNIQUE(a, b) and UNIQUE(a, c). UNIQUE(a, b) can be enforced efficiently. UNIQUE(a, c) might be less efficient depending on the selectivity of "a", but as long as "a" is selective I think it's useful. The alternative is updating two indices on every insert. You may still think this use case is too marginal to bother supporting, but I never made an argument for the use case you described above. If we move away from multiple constraints per index, are you suggesting that I also move the constraints out of pg_constraint and back into pg_index? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Sun, 2009-09-20 at 12:31 -0400, Tom Lane wrote: > > T1: inserts into index > > T2: inserts into index > > T1: checks index for conflicts, finds T2 > > T2: checks index for conflicts, finds T1 > > You get a deadlock failure, because both transactions will wait for each > other. So what? It's an error in any case, and you can get a reported > deadlock in constraint-enforcement scenarios today (conflicting FK > changes, for instance). Well, in theory, one of the transactions may have been destined to be aborted later anyway, and the deadlock detector might kill the wrong one. But I agree, perhaps I'm over-thinking this one. Aside: I just realized that my solution to the deadlock problem won't work with your simpler idea anyway. When reading the index we've long since lost the information about the specific insert of the specific command of the other transaction. I'll make the change. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sat, 2009-09-19 at 19:23 -0700, David Fetter wrote: > It just occurred to me that SQL:2008 ASSERTION might already fit this > feature. :) I think I would only be able to enforce very specific types of assertions that match the template. As I said to Robert, I think I'm going to use ALTER INDEX for the syntax because it appears to be the path of least resistance. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: generalized index constraints
On Sat, 2009-09-19 at 23:15 -0400, Robert Haas wrote: > I was wondering if we couldn't introduce a dummy tuple name similar to > OLD and NEW, called, say, OTHER. Then instead of writing a =, you > could write a = OTHER.a ... or perhaps a = OTHER.b ... although that > might also open the door to more things than you want to support at > this point. Interesting idea. At this point though, there is enough disagreement over the language that I just want to take the least-invasive route that has the lowest chance of causing a problem. It looks like ALTER INDEX might be the path of least resistance. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints [was: generalized index constraints]
On Sat, 2009-09-19 at 18:35 -0400, Tom Lane wrote: > I'm still acutely uncomfortable with using CONSTRAINT syntax for this. > It is not a constraint per standard, because it's not going to be > displayable in information_schema. Furthermore, by extending > standardized syntax you run the risk of being blindsided by future > additions to the standard. Ok. > The point about being able to support multiple constraints with one > index is kind of interesting, but I don't actually think that that's > so useful that it should override all other considerations about what > syntax we should pick. I think we should drop the whole thing and > just treat this as an extension to the CREATE INDEX syntax. Perhaps ALTER INDEX ADD EXCLUSION CONSTRAINT or some other command? And CREATE INDEX can offer the ability as a shorthand? I would still really like to decouple this from CREATE INDEX because of two reasons: 1. Cannot support multiple constraints per index very easily. I think this is a significant feature. 2. Must decide to make constraint at the same time as making the index, and once it's there, you can't remove it without dropping the index. I think either of these still tie the concept to implementation, because creating the index is always explicit. Peter seemed concerned about that, and I think that concern is valid, but I can live with it. If we really want them to be declarative, we could invent a new command. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers