Re: [HACKERS] record identical operator - Review
On Thu, Oct 3, 2013 at 10:46:05AM -0700, Kevin Grittner wrote: opcname - varchar_ops kd_point_ops cidr_ops text_pattern_ops varchar_pattern_ops bpchar_pattern_ops (6 rows) Do these all have operators defined too? Every operator class is associated with operators. For example, while text_pattern_ops uses the same = and operators as the default text opclass (because that already uses memcmp), it adds ~~, ~=~, ~~, and ~=~ operators which also use memcmp (ignoring character encoding and collation). OK, my questions have been answered and I am no longer concerned about this patch causing equality confusion for our users. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On 09/30/2013 09:08 AM, Kevin Grittner wrote: Steve Singer st...@ssinger.info wrote: How about To support matching of rows which include elements without a default B-tree operator class, the following operators are defined for composite type comparison: literal*=/, literal*lt;gt;/, literal*lt;/, literal*lt;=/, literal*gt;/, and literal*gt;=/. These operators compare the internal binary representation of the two rows. Two rows might have a different binary representation even though comparisons of the two rows with the equality operator is true. The ordering of rows under these comparision operators is deterministic but not otherwise meaningful. These operators are used internally for materialized views and might be useful for other specialized purposes such as replication but are not intended to be generally useful for writing queries. I agree that's an improvement. Thanks! Are there any outstanding issues on this patch preventing it from being committed? I think we have discussed this patch enough such that we now have consensus on proceeding with adding a record identical operator to SQL. No one has objected to the latest names of the operators. You haven't adjusted the patch to reduce the duplication between the equality and comparison functions, if you disagree with me and feel that doing so would increase the code complexity and be inconsistent with how we do things elsewhere that is fine. Steve -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On Thu, Oct 3, 2013 at 09:59:03AM -0400, Steve Singer wrote: Are there any outstanding issues on this patch preventing it from being committed? I have not received answers to my email of October 1: http://www.postgresql.org/message-id/20131001024620.gb13...@momjian.us -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: I'm wary of inventing a completely new way of doing this. I don't think that there's any guarantee that the send/recv functions won't expose exactly the same implementation details as a direct check for binary equality. I don't follow this thought. Changing the binary representation which is returned when users use the binary-mode protocol would be a pretty massive user-impacting change, while adding a new way of storing NUMERIC internally wouldn't even be visible to end users. For example, array_send() seems to directly reveal the presence or absence of a NULL bitmap. That's part of the definition of what the binary protocol for array *is* though, so that's simply a fact of life for our users. That doesn't mean we can't, say, change the array header to remove the internal type OID and use a mapping from the type OID that's on the tuple to the type OID inside the array- as long as array_send() still produces the same binary structure for the end user. Even if there were no such anomalies today, it feels fragile to rely on a fairly-unrelated concept to have exactly the semantics we want here, and it will surely be much slower. I agree that it would be slower but performance should be a consideration once correctness is accomplished and this distinction feels a great deal more correct, imv. Binary equality has existing precedent and is used in numerous places in the code for good reason. Users might be confused about the use of those semantics in those places also, but AFAICT nobody is. You've stated that a few times and I've simply not had time to run down the validity of it- so, where does internal-to-PG binary equality end up being visible to our users? Independent of that, are there places in the backend which could actually be refactored to use these new operators where it would reduce code complexity? On the other hand, if you are *replicating* those data types, then you don't want that tolerance. If you're checking whether two boxes are equal, you may indeed want the small amount of fuzziness that our comparison operators allow. But if you're copying a box or a float from one table to another, or from one database to another, you want the values copied exactly, including all of those low-order bits that tend to foul up your comparisons. That's why float8out() normally doesn't display any extra_float_digits - because you as the user shouldn't be relying on them - but pg_dump does back them up because not doing so would allow errors to propagate. Similarly here. I agree that we should be copying the values exactly- and I think we're already good there when it comes to doing a *copy*. I further agree that updating the matview should be a copy, but the manner in which we're doing that is using an equality check to see if the value needs to be updated or not which is where things get a bit fuzzy. If we were consistently copying and updating the value based on some external knowledge that the value has changed (similar to how slony works w/ triggers that dump change sets into a table- it doesn't consider has any value on this row changed?; the user did an update, presumably for some purpose, therefore the change gets recorded and propagated), I'd be perfectly happy. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator - Review
Steve, Thanks for following-up on this; I had meant to reply much sooner but other things got in the way. Thanks again, Stephen * Steve Singer (st...@ssinger.info) wrote: Are there any outstanding issues on this patch preventing it from being committed? I think we have discussed this patch enough such that we now have consensus on proceeding with adding a record identical operator to SQL. No one has objected to the latest names of the operators. You haven't adjusted the patch to reduce the duplication between the equality and comparison functions, if you disagree with me and feel that doing so would increase the code complexity and be inconsistent with how we do things elsewhere that is fine. signature.asc Description: Digital signature
Re: [HACKERS] record identical operator - Review
Bruce Momjian br...@momjian.us wrote: On Fri, Sep 27, 2013 at 03:34:20PM -0700, Kevin Grittner wrote: We first need to document the existing record comparison operators. If they read the docs for comparing row_constructors and expect that to be the behavior they get when they compare records, they will be surprised. Well, if they appear in \do, I am thinking they should be documented. This patch now covers the ones which are record comparison operators, old and new. Feel free to document the others if you feel strongly on that point; but I don't feel that becomes the business of this patch. Because comparing primary keys doesn't tell us whether the old and new values in the row all match. OK, but my question was about why we need a full set of operators rather than just equal, and maybe not equal. I thought you said we needed others, e.g. , so we could do merge joins, but I thought we would just be doing comparisons after primary keys are joined, and if that is true, we could just use a function. http://www.postgresql.org/docs/current/interactive/xoper-optimization.html#AEN54334 Actually, I am now realizing you have to use the non-binary-level equals comparison on keys, then the binary-level equals on rows for this to work --- that's pretty confusing. Is that true? It's a matter of replacing the = operator for record comparisons in these two places with the new *= operator. http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/matview.c;h=238ccc72f5205ae00a15e6e17f384addfa445552;hb=master#l553 http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/matview.c;h=238ccc72f5205ae00a15e6e17f384addfa445552;hb=master#l684 (With or without the new operator, the second of these needs to be schema-qualified to avoid trouble if the user adds a different implementation of = ahead of the pg_catalog implementation on search_path, as users can do.) The difference between = and *= would not generally be visible to end users -- just to those working on matview.c. A quick query (lacking schema information and schema qualification) shows what is there by default: OK, the unique list is: opcname - varchar_ops kd_point_ops cidr_ops text_pattern_ops varchar_pattern_ops bpchar_pattern_ops (6 rows) Do these all have operators defined too? Every operator class is associated with operators. For example, while text_pattern_ops uses the same = and operators as the default text opclass (because that already uses memcmp), it adds ~~, ~=~, ~~, and ~=~ operators which also use memcmp (ignoring character encoding and collation). -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
Steve Singer st...@ssinger.info wrote: You haven't adjusted the patch to reduce the duplication between the equality and comparison functions, if you disagree with me and feel that doing so would increase the code complexity and be inconsistent with how we do things elsewhere that is fine. I think the main reason to keep them separate is that it makes it easier to compare record_cmp to record_image_cmp and record_eq to record_image_eq to see what the differences and similarities are. Other reasons are that I think all those conditionals inside a combined function would get messy and make the logic harder to understand. The number of places that would need conditionals, plus new wrappers that would be needed, would mean that the net reduction in lines of code would be minimal. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On Thu, Oct 3, 2013 at 11:12 AM, Stephen Frost sfr...@snowman.net wrote: Binary equality has existing precedent and is used in numerous places in the code for good reason. Users might be confused about the use of those semantics in those places also, but AFAICT nobody is. You've stated that a few times and I've simply not had time to run down the validity of it- so, where does internal-to-PG binary equality end up being visible to our users? Independent of that, are there places in the backend which could actually be refactored to use these new operators where it would reduce code complexity? Well, the most obvious example is HOT. README.HOT sayeth: % The requirement for doing a HOT update is that none of the indexed % columns are changed. This is checked at execution time by comparing the % binary representation of the old and new values. We insist on bitwise % equality rather than using datatype-specific equality routines. The % main reason to avoid the latter is that there might be multiple notions % of equality for a datatype, and we don't know exactly which one is % relevant for the indexes at hand. We assume that bitwise equality % guarantees equality for all purposes. That bit about multiple notions of equality applies here too - because we don't know what the user will do with the data in the materialized view, Kevin's looking for an operator which guarantees equality for all purposes. Note that HOT could feed everything through the send and recv functions, as you are proposing here, and that would allow HOT to apply in cases where it does not currently apply. We could, for example, perform a HOT update when a value in a numeric column is changed from the long format to the short format without changing the user-perceived value. You could argue that HOT isn't user-visible, but we certainly advise people to think about structuring their indexing in a fashion that does not defeat HOT, so I think to some extent it is user-visible. Also, in xfunc.sgml, we have this note: The planner also sometimes relies on comparing constants via bitwise equality, so you can get undesirable planning results if logically-equivalent values aren't bitwise equal. There are other places as well. If two node trees are compared using equal(), any Const nodes in that tree will be compared for binary equality. So for example MergeWithExistingConstraint() will error out if the constraints are equal under btree equality operators but not binary equal. equal() is also used in various places in the planner, which may be the reason for the above warning. The point I want to make here is that we have an existing precedent to use bitwise equality when we want to make sure that values are equivalent for all purposes, regardless of what opclass or whatever is in use. There are not a ton of those places but there are some. On the other hand, if you are *replicating* those data types, then you don't want that tolerance. If you're checking whether two boxes are equal, you may indeed want the small amount of fuzziness that our comparison operators allow. But if you're copying a box or a float from one table to another, or from one database to another, you want the values copied exactly, including all of those low-order bits that tend to foul up your comparisons. That's why float8out() normally doesn't display any extra_float_digits - because you as the user shouldn't be relying on them - but pg_dump does back them up because not doing so would allow errors to propagate. Similarly here. I agree that we should be copying the values exactly- and I think we're already good there when it comes to doing a *copy*. I further agree that updating the matview should be a copy, but the manner in which we're doing that is using an equality check to see if the value needs to be updated or not which is where things get a bit fuzzy. If we were consistently copying and updating the value based on some external knowledge that the value has changed (similar to how slony works w/ triggers that dump change sets into a table- it doesn't consider has any value on this row changed?; the user did an update, presumably for some purpose, therefore the change gets recorded and propagated), I'd be perfectly happy. Sure, that'd work, but it doesn't explain what's wrong with Kevin's proposal. You're basically saying that memcpy(a, b, len) is OK with you but if (memcmp(a, b, len) != 0) memcpy(a, b, len) is not OK with you. I don't understand how you can endorse copying the value exactly, but not be OK with the optimization that says, well if it already matches exactly, then we don't need to copy it. We can certainly rip out the current implementation of REFRESH MATERIALIZED VIEW CONCURRENTLY and replace it with something that deletes every row in the view and reinserts them all, but it will be far less efficient than what we have now. All that is anybody is asking for here is the
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: If we were consistently copying and updating the value based on some external knowledge that the value has changed (similar to how slony works w/ triggers that dump change sets into a table- it doesn't consider has any value on this row changed?; the user did an update, presumably for some purpose, therefore the change gets recorded and propagated), I'd be perfectly happy. That is the equivalent of the incremental maintenance feature which won't be coming until after REFRESH has settled down. REFRESH is more like the slony initial population of a table. REFRESH CONCURRENTLY is a way of doing that which allows users to continue to read from the matview without blocking or interruption while the REFRESH is running; I think it would be a bug if that could generate different results from a non-CONCURRENT REFRESH. Wanting incremental maintenance (which we all want) is no reason to block an earlier implementation of REFRESH CONCURRENTLY which is not, and does not use, incremental maintenance. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On Thu, Oct 3, 2013 at 10:53 AM, Robert Haas robertmh...@gmail.com wrote: The point I want to make here is that we have an existing precedent to use bitwise equality when we want to make sure that values are equivalent for all purposes, regardless of what opclass or whatever is in use. There are not a ton of those places but there are some. Btree opclasses (which are of course special, per 35.14.6. System Dependencies on Operator Classes) are restricted by the reflexive law, which implies that bitwise equality is a stronger condition than regularly equality. I'm inclined to agree that it isn't a big deal to expose a bitwise equality operator. We're talking about the possibility of being potentially overly eager according to someone's definition about refreshing, not the opposite. With reference to an actual case where this is possible, I don't think this is astonishing, though I grant Stephen that that's aesthetic. I also agree that an intermediate notion of equality isn't worth it. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: You could argue that HOT isn't user-visible, but we certainly advise people to think about structuring their indexing in a fashion that does not defeat HOT, so I think to some extent it is user-visible. I do think saying HOT is user-visible is really stretching things and do we really say somewhere please be careful to make sure your updates to key fields are *BINARY IDENTICAL* to what's stored or HOT won't be used? If anything, that should be listed on a PG 'gotchas' page. Also, in xfunc.sgml, we have this note: The planner also sometimes relies on comparing constants via bitwise equality, so you can get undesirable planning results if logically-equivalent values aren't bitwise equal. And that would be under C-Language Functions, which also includes things like take care to zero out any alignment padding bytes that might be present in structs. There are other places as well. If two node trees are compared using equal(), any Const nodes in that tree will be compared for binary equality. These would primairly be cases where we've created a Const out of a string, or similar, which the *user provided*, no? It strikes me as at least unlikely that we'd end up storing a given string from the user in different ways in memory and so this consideration, again, makes sense for people writing C code but not for your general SQL user. So for example MergeWithExistingConstraint() will error out if the constraints are equal under btree equality operators but not binary equal. equal() is also used in various places in the planner, which may be the reason for the above warning. I wonder if this would need to be changed because you could actually define constraints that operate at a binary level and therefore don't overlap even though they look like they overlap based on btree equality. The point I want to make here is that we have an existing precedent to use bitwise equality when we want to make sure that values are equivalent for all purposes, regardless of what opclass or whatever is in use. There are not a ton of those places but there are some. I agree that there are some cases and further that these operators provide a way of saying are these definitely the same? but they fall down on are these definitely different? That makes these operators useful for these kinds of optimizations, but that's it. Providing SQL level optimization-only operators like this is akin to the SQL standard defining indexes. Sure, that'd work, but it doesn't explain what's wrong with Kevin's proposal. You're basically saying that memcpy(a, b, len) is OK with you but if (memcmp(a, b, len) != 0) memcpy(a, b, len) is not OK with you. I don't understand how you can endorse copying the value exactly, but not be OK with the optimization that says, well if it already matches exactly, then we don't need to copy it. Adding new operators isn't all about what happens at the C-code level. That said, I agree that PG, in general, is more 'open' to exposing implementation details than is perhaps ideal, but it can also be quite useful in some instances. I don't really like doing that in top-level operators like this, but it doesn't seem like there's a whole lot of help for it. I'm not convinced that using the send/recv approach would be all that big of a performance hit but I've not tested it. We can certainly rip out the current implementation of REFRESH MATERIALIZED VIEW CONCURRENTLY and replace it with something that deletes every row in the view and reinserts them all, but it will be far less efficient than what we have now. All that is anybody is asking for here is the ability to skip deleting and reinserting rows that are absolutely identical in the old and new versions of the view. If this was an entirely internal thing, it'd be different, but it's not. Your send/recv proposal would let us also skip deleting and reinserting rows that are ALMOST identical except for not-normally-user-visible binary format differences... but since we haven't worried about allowing such cases for e.g. HOT updates, I don't think we need to worry about them here, either. In practice, such changes are rare as hen's teeth anyway. I'm not entirely convinced that what was done for HOT in this regard is a precedent we should be building on top of. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On 10/04/2013 12:22 AM, Stephen Frost wrote: That said, I agree that PG, in general, is more 'open' to exposing implementation details than is perhaps ideal, Every *real* system is more open to exposing implementation details than is *ideal*. One very popular implementation detail which surprises users over and over is performance under different use cases. There is no way you can hide this. That said, I see nothing bad in having an operator for binary equal or alternately called guaranteed equal. Its negator is not guaranteed unequal but not guaranteed to be equal The main exposed implementation detail of this operator is that it is very fast and can be recommended to be used at user level for speeding up equal query like this SELECT * FROM t WHERE guaranteed equal or equal where the plain equal will only be called when fast guaranteed equal fails. a bit similar to how you can cut down on index size on long text fields by indexing their hashes and then querying SELECT * FROM t WHERE hashtext(verylongtext) = hashtext(sample) AND verylongtext = sample It is absolutely possible that the fact that hash clashes exist also confuses some users the same way the possibility of having binary inequality and be still NOT DISTINCT FROM, but I argue that having a fast guaranteed equal operation available to users is useful. some users may also initially get confused by SELECT 3/2; returning 1 and not the right answer of 1.5. They may even bitch and moan that PostgreSQL is completely broken. But then they look it up or ask on the net and are confused no more. Greetings Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Hannu Krosing (ha...@krosing.net) wrote: The main exposed implementation detail of this operator is that it is very fast and can be recommended to be used at user level for speeding up equal query like this SELECT * FROM t WHERE guaranteed equal or equal where the plain equal will only be called when fast guaranteed equal fails. Yeah, this would be exactly the kind of misuse that we will need to be prepared to support with these new operators. If this is actually faster/better/whatever, then we should be implementing it in our conditional handling, not encouraging users to create hacks like this. a bit similar to how you can cut down on index size on long text fields by indexing their hashes and then querying SELECT * FROM t WHERE hashtext(verylongtext) = hashtext(sample) AND verylongtext = sample This case clearly requires a great deal more thought and consideration on the DBA's side and is also a lot more obvious what it's doing than having 'where x *= 123 or x = 123'. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator - Review
Steve Singer st...@ssinger.info wrote: How about To support matching of rows which include elements without a default B-tree operator class, the following operators are defined for composite type comparison: literal*=/, literal*lt;gt;/, literal*lt;/, literal*lt;=/, literal*gt;/, and literal*gt;=/. These operators compare the internal binary representation of the two rows. Two rows might have a different binary representation even though comparisons of the two rows with the equality operator is true. The ordering of rows under these comparision operators is deterministic but not otherwise meaningful. These operators are used internally for materialized views and might be useful for other specialized purposes such as replication but are not intended to be generally useful for writing queries. I agree that's an improvement. Thanks! -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On Fri, Sep 27, 2013 at 03:34:20PM -0700, Kevin Grittner wrote: The arguments for this patch are * We want the materialized view to return the same value as would be returned if the query were executed directly. This not only means that the values should be the same according to a datatypes = operator but that they should appear the same 'to the eyeball'. And to functions the user can run against the values. The example with the null bitmap for arrays being included when not necessary is something that isn't directly apparent to the eye, but queries which use pg_column_size would not get equal results. pg_column_size() is by definition internal details, so I am not worried about that not matching. * Supporting the materialized view refresh check via SQL makes a lot of sense but doing that requires exposing something via SQL * If we adequately document what we mean by record_image_identical and the operator we pick for this then users shouldn't be surprised at what they get if they use this We first need to document the existing record comparison operators. If they read the docs for comparing row_constructors and expect that to be the behavior they get when they compare records, they will be surprised. Well, if they appear in \do, I am thinking they should be documented. This is a good summary. I think there are a few things that make this issue difficult to decide: 1. We have to use an operator to give the RMVC (REFRESH MATERIALIZED VIEW CONCURRENTLY) SPI query the ability to optimize this query. If we could do this with an SQL or C function, there would be less concern about the confusion caused by adding this capability. (a) We can't. (b) Why would that be less confusing? Because function names, especially long ones, are more clearly self-documenting than operators. Question: If we are comparing based on some primary key, why do we need this to optimize? Because comparing primary keys doesn't tell us whether the old and new values in the row all match. OK, but my question was about why we need a full set of operators rather than just equal, and maybe not equal. I thought you said we needed others, e.g. , so we could do merge joins, but I thought we would just be doing comparisons after primary keys are joined, and if that is true, we could just use a function. Actually, I am now realizing you have to use the non-binary-level equals comparison on keys, then the binary-level equals on rows for this to work --- that's pretty confusing. Is that true? 3. Our type casting and operators are already complex, and adding another set of operators only compounds that. It cannot have any effect on any of the existing operators, so I'm not sure whether you are referring to the extra operators and functions, or something else. It does not, for example, introduce any risk of ambiguous operators. It makes our system more complex for the user to understand. One interesting approach would be to only allow the operator to be called from SPI queries. Why would that be a good idea? Because then it would be more of an internal operator. It would also be good to know about similar non-default entries in pg_opclass so we can understand the expected impact. A quick query (lacking schema information and schema qualification) shows what is there by default: OK, the unique list is: opcname - varchar_ops kd_point_ops cidr_ops text_pattern_ops varchar_pattern_ops bpchar_pattern_ops (6 rows) Do these all have operators defined too? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On 09/28/2013 03:03 PM, Kevin Grittner wrote: para +To support matching of rows which include elements without a default +B-tree operator class, the following operators are defined for composite +type comparison: +literal*=/, +literal*lt;gt;/, +literal*lt;/, +literal*lt;=/, +literal*gt;/, and +literal*gt;=/. +These operators are also used internally to maintain materialized views, +and may be useful to replication software. To ensure that all user +visible changes are detected, even when the equality operator for the +type treats old and new values as equal, the byte images of the stored +data are compared. While ordering is deterministic, it is not generally +useful except to facilitate merge joins. Ordering may differ between +system architectures and major releases of +productnamePostgreSQL/productname. + /para How about To support matching of rows which include elements without a default B-tree operator class, the following operators are defined for composite type comparison: literal*=/, literal*lt;gt;/, literal*lt;/, literal*lt;=/, literal*gt;/, and literal*gt;=/. These operators compare the internal binary representation of the two rows. Two rows might have a different binary representation even though comparisons of the two rows with the equality operator is true. The ordering of rows under these comparision operators is deterministic but not otherwise meaningful. These operators are used internally for materialized views and might be useful for other specialized purposes such as replication but are not intended to be generally useful for writing queries. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
Kevin Grittner kgri...@ymail.com wrote: Bruce Momjian br...@momjian.us wrote: I think we need to see a patch from Kevin that shows all the row comparisons documented and we can see the impact of the user-visibile part. First draft attached. I'm inclined to first submit a proposed documentation patch for the existing record operators, and see if we can make everyone happy with that, and *then* see about adding documentation for the new ones. Trying to deal with both at once is likely to increase confusion and wheel-spinning. When I took a stab at it, the new operators only seemed to merit a single paragraph, so that is included after all. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company*** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 12739,12745 WHERE EXISTS (SELECT 1 FROM tab2 WHERE col2 = tab1.col2); para See xref linkend=row-wise-comparison for details about the meaning !of a row-wise comparison. /para /sect2 --- 12739,12745 para See xref linkend=row-wise-comparison for details about the meaning !of a row constructor comparison. /para /sect2 *** *** 12795,12806 WHERE EXISTS (SELECT 1 FROM tab2 WHERE col2 = tab1.col2); para See xref linkend=row-wise-comparison for details about the meaning !of a row-wise comparison. /para /sect2 sect2 !titleRow-wise Comparison/title indexterm zone=functions-subquery primarycomparison/primary --- 12795,12806 para See xref linkend=row-wise-comparison for details about the meaning !of a row constructor comparison. /para /sect2 sect2 !titleSingle-row Comparison/title indexterm zone=functions-subquery primarycomparison/primary *** *** 12823,12829 WHERE EXISTS (SELECT 1 FROM tab2 WHERE col2 = tab1.col2); para See xref linkend=row-wise-comparison for details about the meaning !of a row-wise comparison. /para /sect2 /sect1 --- 12823,12829 para See xref linkend=row-wise-comparison for details about the meaning !of a row constructor comparison. /para /sect2 /sect1 *** *** 12853,12864 WHERE EXISTS (SELECT 1 FROM tab2 WHERE col2 = tab1.col2); /indexterm indexterm primaryrow-wise comparison/primary /indexterm indexterm primarycomparison/primary !secondaryrow-wise/secondary /indexterm indexterm --- 12853,12874 /indexterm indexterm +primarycomposite type/primary +secondarycomparison/secondary + /indexterm + + indexterm primaryrow-wise comparison/primary /indexterm indexterm primarycomparison/primary !secondarycomposite type/secondary ! /indexterm ! ! indexterm !primarycomparison/primary !secondaryrow constructor/secondary /indexterm indexterm *** *** 13023,13029 AND /sect2 sect2 id=row-wise-comparison !titleRow-wise Comparison/title synopsis replaceablerow_constructor/replaceable replaceableoperator/replaceable replaceablerow_constructor/replaceable --- 13033,13039 /sect2 sect2 id=row-wise-comparison !titleRow Constructor Comparison/title synopsis replaceablerow_constructor/replaceable replaceableoperator/replaceable replaceablerow_constructor/replaceable *** *** 13033,13052 AND Each side is a row constructor, as described in xref linkend=sql-syntax-row-constructors. The two row values must have the same number of fields. !Each side is evaluated and they are compared row-wise. Row comparisons !are allowed when the replaceableoperator/replaceable is literal=/, literallt;gt;/, literallt;/, literallt;=/, literalgt;/ or !literalgt;=/, !or has semantics similar to one of these. (To be specific, an operator !can be a row comparison operator if it is a member of a B-tree operator !class, or is the negator of the literal=/ member of a B-tree operator !class.) /para para The literal=/ and literallt;gt;/ cases work slightly differently from the others. Two rows are considered --- 13043,13067 Each side is a row constructor, as described in xref linkend=sql-syntax-row-constructors. The two row values must have the same number of fields. !Each side is evaluated and they are compared row-wise. Row constructor !comparisons are allowed when the replaceableoperator/replaceable is literal=/, literallt;gt;/, literallt;/, literallt;=/, literalgt;/ or !literalgt;=/. !Every row element must be of a type which has a default B-tree operator !class or the attempted comparison may generate an error. /para + note +para + Errors related to the number or types of elements might not
Re: [HACKERS] record identical operator - Review
Bruce Momjian br...@momjian.us wrote: On Thu, Sep 19, 2013 at 06:31:38PM -0400, Steve Singer wrote: On 09/12/2013 06:27 PM, Kevin Grittner wrote: Attached is a patch for a bit of infrastructure I believe to be necessary for correct behavior of REFRESH MATERIALIZED VIEW CONCURRENTLY as well as incremental maintenance of matviews. Here is attempt at a review of the v1 patch. There has been a heated discussion on list about if we even want this type of operator. I will try to summarize it here The arguments against it are * that a record_image_identical call on two records that returns false can still return true under the equality operator, and the records might (or might not) appear to be the same to users * Once we expose this as an operator via SQL someone will find it and use it and then complain when we change things such as the internal representation of a datatype which might break there queries I don't see where that is possible unless they count on the order of records sorted with the new operators, which would not be something I recommend. Changing the internal representation cannot break simple use of the alternative equality definition as long as all you cared about was whether the binary storage format of the values was identical. * The differences between = and record_image_identical might not be understood by everywhere who finds the operator leading to confusion Either they find it and read the code, or we document it and they read the docs. * Using a criteria other than equality (the = operator provided by the datatype) might cause problems if we later provide 'on change' triggers for materialized views I would fight user-defined triggers for matviews tooth and nail. We need to get incremental maintenance working instead. The arguments for this patch are * We want the materialized view to return the same value as would be returned if the query were executed directly. This not only means that the values should be the same according to a datatypes = operator but that they should appear the same 'to the eyeball'. And to functions the user can run against the values. The example with the null bitmap for arrays being included when not necessary is something that isn't directly apparent to the eye, but queries which use pg_column_size would not get equal results. * Supporting the materialized view refresh check via SQL makes a lot of sense but doing that requires exposing something via SQL * If we adequately document what we mean by record_image_identical and the operator we pick for this then users shouldn't be surprised at what they get if they use this We first need to document the existing record comparison operators. If they read the docs for comparing row_constructors and expect that to be the behavior they get when they compare records, they will be surprised. This is a good summary. I think there are a few things that make this issue difficult to decide: 1. We have to use an operator to give the RMVC (REFRESH MATERIALIZED VIEW CONCURRENTLY) SPI query the ability to optimize this query. If we could do this with an SQL or C function, there would be less concern about the confusion caused by adding this capability. (a) We can't. (b) Why would that be less confusing? Question: If we are comparing based on some primary key, why do we need this to optimize? Because comparing primary keys doesn't tell us whether the old and new values in the row all match. Certainly the primary key index will be used, no? It currently uses all columns which are part of unique indexes on the matview which are not partial (i.e., they do not use a WHERE clause), they index only on columns (not expressions). It is not possible to define a primary key on a matview. There are two reasons for using these columns: (a) It provides a correctness guarantee against having duplicated rows which have no nulls. This guarantee is what allows us to use the differential technique which is faster than retail DELETE and re-INSERT of everything. (b) Since we don't support hash joins of records, and it would tend to be pretty slow if we did, it allows faster hash joins on one or more columns which stand a good chance of doing this efficiently. 2. Agregates are already non-deterministic, Only some of them, and only with certain source data. so some feel that adding this feature doesn't improve much. It allows a materialized view which contains a column of a type without a default btree opclass to *use* concurrent refresh, it makes queries with deterministic results always generate matview results with exactly match the results of a VIEW using the query if it were run at the same moment, and for nondeterministic queries, it provides results consistent with *some* result set which could have been returned by a view at the same time. Unless we do something, none of these things are true. That seems like much to me. The counter-argument is that
Re: [HACKERS] record identical operator
On Tue, Sep 24, 2013 at 3:22 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: Now I admit that's an arguable point. We could certainly define an intermediate notion of equality that is more equal than whatever = does, but not as equal as exact binary equality. I suggested it up-thread and don't recall seeing a response, so here it is again- passing the data through the binary-out function for the type and comparing *that* would allow us to change the interal binary representation of data types and would be something which we could at least explain to users as to why X isn't the same as Y according to this binary operator. Sorry, I missed that suggestion. I'm wary of inventing a completely new way of doing this. I don't think that there's any guarantee that the send/recv functions won't expose exactly the same implementation details as a direct check for binary equality. For example, array_send() seems to directly reveal the presence or absence of a NULL bitmap. Even if there were no such anomalies today, it feels fragile to rely on a fairly-unrelated concept to have exactly the semantics we want here, and it will surely be much slower. Binary equality has existing precedent and is used in numerous places in the code for good reason. Users might be confused about the use of those semantics in those places also, but AFAICT nobody is. I think the best thing I can say about this proposal is that it would clearly be better than what we're doing now, which is just plain wrong. I don't think it's the best proposal, however. It is obviously true, and unavoidable, that if the query can produce more than one result set depending on the query plan or other factors, then the materialized view contents can match only one of those possible result sets. But you are arguing that it should be allowed to produce some OTHER result set, that a direct execution of the query could *never* have produced, and I can't see how that can be right. I agree that the matview shouldn't produce things which *can't* exist through an execution of the query. I don't intend to argue against that but I realize that's a fallout of not accepting the patch to implement the binary comparison operators. My complaint is more generally that if this approach needs such then there's going to be problems down the road. No, I can't predict exactly what they are and perhaps I'm all wet here, but this kind of binary-equality operations are something I've tried to avoid since discovering what happens when you try to compare a couple of floating point numbers. So, I get that. But what I think is that the problem that's coming up here is almost the flip side of that. If you are working with types that are a little bit imprecise, such as floats or citext or box, you want to use comparison strategies that have a little bit of tolerance for error. In the case of box, this is actually built into the comparison operator. In the case of floats, it's not; you as the application programmer have to deal with the fact that comparisons are imprecise - like by avoiding equality comparisons. On the other hand, if you are *replicating* those data types, then you don't want that tolerance. If you're checking whether two boxes are equal, you may indeed want the small amount of fuzziness that our comparison operators allow. But if you're copying a box or a float from one table to another, or from one database to another, you want the values copied exactly, including all of those low-order bits that tend to foul up your comparisons. That's why float8out() normally doesn't display any extra_float_digits - because you as the user shouldn't be relying on them - but pg_dump does back them up because not doing so would allow errors to propagate. Similarly here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On Thu, Sep 19, 2013 at 06:31:38PM -0400, Steve Singer wrote: On 09/12/2013 06:27 PM, Kevin Grittner wrote: Attached is a patch for a bit of infrastructure I believe to be necessary for correct behavior of REFRESH MATERIALIZED VIEW CONCURRENTLY as well as incremental maintenance of matviews. Here is attempt at a review of the v1 patch. There has been a heated discussion on list about if we even want this type of operator. I will try to summarize it here The arguments against it are * that a record_image_identical call on two records that returns false can still return true under the equality operator, and the records might (or might not) appear to be the same to users * Once we expose this as an operator via SQL someone will find it and use it and then complain when we change things such as the internal representation of a datatype which might break there queries * The differences between = and record_image_identical might not be understood by everywhere who finds the operator leading to confusion * Using a criteria other than equality (the = operator provided by the datatype) might cause problems if we later provide 'on change' triggers for materialized views The arguments for this patch are * We want the materialized view to return the same value as would be returned if the query were executed directly. This not only means that the values should be the same according to a datatypes = operator but that they should appear the same 'to the eyeball'. * Supporting the materialized view refresh check via SQL makes a lot of sense but doing that requires exposing something via SQL * If we adequately document what we mean by record_image_identical and the operator we pick for this then users shouldn't be surprised at what they get if they use this This is a good summary. I think there are a few things that make this issue difficult to decide: 1. We have to use an operator to give the RMVC (REFRESH MATERIALIZED VIEW CONCURRENTLY) SPI query the ability to optimize this query. If we could do this with an SQL or C function, there would be less concern about the confusion caused by adding this capability. Question: If we are comparing based on some primary key, why do we need this to optimize? Certainly the primary key index will be used, no? 2. Agregates are already non-deterministic, so some feel that adding this feature doesn't improve much. The counter-argument is that without the binary row comparison ability, rows could be returned that could _never_ have been produced by the base data, which is more of a user surprise than non-deterministic rows. 3. Our type casting and operators are already complex, and adding another set of operators only compounds that. 4. There are legitimate cases where tool makers might want the ability to compare rows binarily, e.g. for replication, and adding these operators would help with that. I think we need to see a patch from Kevin that shows all the row comparisons documented and we can see the impact of the user-visibile part. One interesting approach would be to only allow the operator to be called from SPI queries. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On Thu, Sep 26, 2013 at 05:50:08PM -0400, Bruce Momjian wrote: I think we need to see a patch from Kevin that shows all the row comparisons documented and we can see the impact of the user-visibile part. One interesting approach would be to only allow the operator to be called from SPI queries. It would also be good to know about similar non-default entries in pg_opclass so we can understand the expected impact. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Kevin Grittner kgri...@ymail.com wrote: Here is v2 of the patch which changes from the universally disliked operator names v1 used. It also fixes bugs in the row comparisons for pass-by-reference types, fixes a couple nearby comments, and adds regression tests for a matview containing a box column. I accidentally omitted the regression tests for matview with a box column from v2. Rather than redoing the whole thing, here is a v2a patch to add just that omitted part. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company*** a/src/test/regress/expected/matview.out --- b/src/test/regress/expected/matview.out *** *** 412,418 ERROR: new data for mv contains duplicate rows without any NULL columns DETAIL: Row: (1,10) DROP TABLE foo CASCADE; NOTICE: drop cascades to materialized view mv ! -- make sure that all indexes covered by unique indexes works CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3); CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo; CREATE UNIQUE INDEX ON mv (a); --- 412,418 DETAIL: Row: (1,10) DROP TABLE foo CASCADE; NOTICE: drop cascades to materialized view mv ! -- make sure that all columns covered by unique indexes works CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3); CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo; CREATE UNIQUE INDEX ON mv (a); *** *** 424,426 REFRESH MATERIALIZED VIEW mv; --- 424,446 REFRESH MATERIALIZED VIEW CONCURRENTLY mv; DROP TABLE foo CASCADE; NOTICE: drop cascades to materialized view mv + -- make sure that types with unusual equality tests work + CREATE TABLE boxes (id serial primary key, b box); + INSERT INTO boxes (b) VALUES + ('(32,32),(31,31)'), + ('(2.004,2.004),(1,1)'), + ('(1.996,1.996),(1,1)'); + CREATE MATERIALIZED VIEW boxmv AS SELECT * FROM boxes; + CREATE UNIQUE INDEX boxmv_id ON boxmv (id); + UPDATE boxes SET b = '(2,2),(1,1)' WHERE id = 2; + REFRESH MATERIALIZED VIEW CONCURRENTLY boxmv; + SELECT * FROM boxmv ORDER BY id; + id | b + +- + 1 | (32,32),(31,31) + 2 | (2,2),(1,1) + 3 | (1.996,1.996),(1,1) + (3 rows) + + DROP TABLE boxes CASCADE; + NOTICE: drop cascades to materialized view boxmv *** a/src/test/regress/sql/matview.sql --- b/src/test/regress/sql/matview.sql *** *** 143,149 REFRESH MATERIALIZED VIEW mv; REFRESH MATERIALIZED VIEW CONCURRENTLY mv; DROP TABLE foo CASCADE; ! -- make sure that all indexes covered by unique indexes works CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3); CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo; CREATE UNIQUE INDEX ON mv (a); --- 143,149 REFRESH MATERIALIZED VIEW CONCURRENTLY mv; DROP TABLE foo CASCADE; ! -- make sure that all columns covered by unique indexes works CREATE TABLE foo(a, b, c) AS VALUES(1, 2, 3); CREATE MATERIALIZED VIEW mv AS SELECT * FROM foo; CREATE UNIQUE INDEX ON mv (a); *** *** 154,156 INSERT INTO foo VALUES(3, 4, 5); --- 154,169 REFRESH MATERIALIZED VIEW mv; REFRESH MATERIALIZED VIEW CONCURRENTLY mv; DROP TABLE foo CASCADE; + + -- make sure that types with unusual equality tests work + CREATE TABLE boxes (id serial primary key, b box); + INSERT INTO boxes (b) VALUES + ('(32,32),(31,31)'), + ('(2.004,2.004),(1,1)'), + ('(1.996,1.996),(1,1)'); + CREATE MATERIALIZED VIEW boxmv AS SELECT * FROM boxes; + CREATE UNIQUE INDEX boxmv_id ON boxmv (id); + UPDATE boxes SET b = '(2,2),(1,1)' WHERE id = 2; + REFRESH MATERIALIZED VIEW CONCURRENTLY boxmv; + SELECT * FROM boxmv ORDER BY id; + DROP TABLE boxes CASCADE; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 2013-09-23 21:21:53 -0400, Stephen Frost wrote: Here's an example to illustrate what I'm talking about when it comes down to you can't claim that you'll produce exactly what the query will always, with these types: =# table citext_table; id | name +--- 1 | one 3 | three 5 | 4 | Three 2 | Three (5 rows) =# create MATERIALIZED VIEW citext_matview AS select name from citext_table where id 0 group by name; SELECT 3 =# table citext_matview; name --- one three (3 rows) I don't really understand why you have a problem with this specific thing here. Kevin's goal seems to be to make materialized views behave consistent with the way a plain view would if the matviews would just have been refreshed fully, concurrently or incrementally and there have been no further data changes. SELECT * FROM table WHERE candidate_key IN (...); used in a view or plainly currently guarantees you that you get the original casing for citext. And if you e.g. have some additional expensive joins, such a matview can very well make sense. What does it matter that ... GROUP BY citext_col; doesn't return the same casing consistently? You're aggregating, not accessing the original data. If you need to separate the different casings now, cast to text. Now, I can perfectly understand having a bit of architectural qualms about Kevin's approach of things on the SQL level. But this doesn't seem to be the thread to discuss that. FWIW, I can't forsee any realistic approach that actually won't do such comparisons (although not necessarily through C). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 09/23/2013 10:38 PM, Stephen Frost wrote: and heavily caveated. I'm not sure what caveats would be needed. It seems to me that a clear description of what it does would suffice. Like all the other non-default opclasses in core, it will be non-default because it is less frequently useful. This will claim things are different, even when they aren't different when cast to text, or possibly even when extracted in binary mode, ensure this is really what you want is a pretty big caveat, imv. Yes, it should be documented that it tests for sameness and gives no guarantees that lack of sameness means different (as determined by some other operator) -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: Your example demonstrates that if a given query can generate two different outputs, A and B, based on the same input data, then the contents of the materialized view cannot be equal to be A and also equal to B. Well, no duh. Two different outputs based on what *plan* is chosen. Of course, you don't need citext, or any other data type with a loose notion of equality, to generate that sort of problem: [...] rhaas=# set datestyle = 'german'; SET I'm talking about *planner differences* changing the results. If we've got a ton of cases where a different plan means different output, then we've got some serious problems. I'd argue that it's pretty darn clear that datestyle is going to be a *slightly* different animal. My example doesn't *require* changing any GUCs, it was just expedient for illustration. But I'm still wondering what this is intended to prove. These types are, to those that use them at least, a known quantity wrt what you get when using them in GROUP BYs, JOINs, etc. You're trying to 'fix' something that isn't really broken and you're only doing it half-baked anyway because your 'fix' isn't going to actually make these types produce consistent results. There are an infinite number of ways to write queries that produce different results, and I think we all know that materialized views aren't going to hold up very well if given such queries. That seems a poor excuse for not fixing the cases that can be made to work. New operators are not without cost, I don't think it's a good idea to expose our internal binary representations of data out to the SQL level, and the justification for adding them is that they solve a case that they don't. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: Skipping out on much of the side-discussion to try and drive at the heart of this.. Robert Haas (robertmh...@gmail.com) wrote: I would suggest that you read the referenced papers for details as to how the algorithm works. To make a long story short, they do need to keep track of what's changed, and how. I suppose it's good to know that I wasn't completely misunderstanding the discussion in Ottawa. However, that still seems largely orthogonal to the present discussion. It *solves* this issue, from where I'm sitting, without any binary operators at all. [ argument that the only way we should ever do REFRESH is by using captured deltas, through incremental maintenance techniques ] That would ensure that a query could not be used to define a matview until we had implemented incremental maintenance for queries of that type and complexity. I expect that to come *close* to covering all useful queries that way will take five to ten years, if all goes well. The approach I'm taking makes all queries available *now*, with improvements in how many can be maintained incrementally improving over time. This is the route every other database I've looked at has taken (for good reason, I think). Ultimately, even when we have incremental maintenance supported for all queries that can be used to define a matview, I think there will be demand for a REFRESH command that re-runs the base query. Not only does that fit the workload for some matviews, but consider this, from the paper I cited[1]: | Recomputing the view from scratch is too wasteful in most cases. | Using the heuristic of inertia (only a part of the view changes | in response to changes in the base relations), it is often | cheaper to compute only the changes in the view. We stress that | the above is only a heuristic. For example, if an entire base | relation is deleted, it may be cheaper to recompute a view that | depends on the deleted relation (if the new view will quickly | evaluate to an empty relation) than to compute the changes to the | view. What we're talking about is a performance heuristic -- not something more profound than that. The route I'm taking is to get it *working correctly now* using the simple technique, and then embarking on the long journey of optimizing progressively more cases. What your argument boils down to IMV is essentially a case of premature optimization. You have yet to show any case where the existing patch does not yield correct results, or show that there is a better way to get to the point this patch takes us. [ later post shows a query that does not produce deterministic results ] Sure, two direct runs of that same query, or two runs through a regular view, could show different results (considering synchronized scans, among other things). I don't see what that proves. Of course a refresh of a matview is only going to produce one of those and then will not produce a different result until it is refreshed or (once we add incremental maintenance) something changes in the underlying data. Nobody ever claimed that a query which does not produce consistent results would somehow produce them with this patch. There are queries using citext, numeric, and other types which *do* provide consistent results which are consistently produced by a straight query, a simple view, or a non-concurrent refresh of a materialized view; this patch will cause a concurrent refresh to produce the same results as those, rather than something different. Period. That's the point, and the whole point. You have not shown that it doesn't. You have not shown why adding a 12th non-default opclass is a particular problem here (although we have a consensus to use different operators, to reserve this operator namespace for other things). You have not made any case at all for why people should wait for incremental maintenance to be mature (a project which will take years) before being able to use materialized views with concurrent refreshes. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] A. Gupta, I. S. Mumick, and V. S. Subrahmanian. Maintaining Views Incrementally. In SIGMOD 1993, pages 157-167. http://dl.acm.org/citation.cfm?id=170066 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On Tue, Sep 24, 2013 at 7:14 AM, Stephen Frost sfr...@snowman.net wrote: Of course, you don't need citext, or any other data type with a loose notion of equality, to generate that sort of problem: [...] rhaas=# set datestyle = 'german'; SET I'm talking about *planner differences* changing the results. If we've got a ton of cases where a different plan means different output, then we've got some serious problems. I'd argue that it's pretty darn clear that datestyle is going to be a *slightly* different animal. My example doesn't *require* changing any GUCs, it was just expedient for illustration. I'm completely confused, here. What's so special about planner differences? Obviously, there are tons of queries that can produce different results based on planner differences, but materialized views didn't create that problem and they aren't responsible for solving it. Also, even restricting ourselves to planner differences, there's no particular link between the behavior of the type's equality operator and whether or not the query always produces the same results. For example, let's consider good old text. rhaas=# create table tag_data (id integer, tag text, userid text, primary key (id, tag)); CREATE TABLE rhaas=# insert into tag_data values (1, 'foo', 'tom'), (1, 'bar', 'dick'), (2, 'baz', 'harry'); INSERT 0 3 rhaas=# select id, string_agg(tag||':'||userid, ', ') tags from tag_data group by 1; id | tags +--- 1 | foo:tom, bar:dick 2 | baz:harry (2 rows) rhaas=# set enable_seqscan=false; SET rhaas=# select id, string_agg(tag||':'||userid, ', ') tags from tag_data group by 1; id | tags +--- 1 | bar:dick, foo:tom 2 | baz:harry (2 rows) Now texteq() is just a glorified version of memcmp(), so what does this complaint possibly have to do with Kevin's patch, or even materialized views in general? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: That's the point, and the whole point. You have not shown that it doesn't. You have not shown why adding a 12th non-default opclass is a particular problem here (although we have a consensus to use different operators, to reserve this operator namespace for other things). We need justification to add operators, imv, especially ones that expose our internal binary representation of data. I worry that adding these will come back to bite us later and that we're making promises we won't be able to keep. If these inconsistencies in what happens with these data types are an issue then REFRESH can be handled as a wholesale DELETE/INSERT. Trying to do this incremental-but-not-really maintenance where the whole query is run but we try to skimp on what's actually getting updated in the matview is a premature optimization, imv, and one which may be less performant and more painful, with more gotchas and challenges for our users, to deal with in the long run. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: We need justification to add operators, imv, especially ones that expose our internal binary representation of data. I believe I have done so. I worry that adding these will come back to bite us later How? and that we're making promises we won't be able to keep. The promise that a concurrent refresh will produce the same set of rows as a non-concurrent one? If these inconsistencies in what happens with these data types are an issue then REFRESH can be handled as a wholesale DELETE/INSERT. I have real-life experience with handling faux materialized views by creating tables (at Wisconsin Courts). We initially did it the way you describe, but the run time was excessive (in Milwaukee County the overnight run did not always complete before the start of business hours the next day). Switching to logic similar to what I've implemented here, it completed an order of magnitude faster, and generated a small fraction of the WAL and logical replication data that the technique you describe did. Performing preliminary steps in temporary tables to minimize the changes needed to permanent tables seems beneficial enough on the face of it that I think the burden of proof should be on someone arguing that deleting all rows and re-inserting (in the same transaction) is, in general, a superior approach to finding the differences and applying only those/ Trying to do this incremental-but-not-really maintenance where the whole query is run but we try to skimp on what's actually getting updated in the matview is a premature optimization, imv, and one which may be less performant and more painful, with more gotchas and challenges for our users, to deal with in the long run. I have the evidence of a ten-fold performance improvement plus minimized WAL and replication work on my side. What evidence do you have to back your assertions? (Don't forget to work in bloat and vacuum truncation issues to the costs of your proposal.) -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: Stephen Frost sfr...@snowman.net wrote: I worry that adding these will come back to bite us later How? User misuse is certainly one consideration, but I wonder what's going to happen if we change our internal representation of data (eg: numerics get changed again), or when incremental matview maintenance happens and we start looking at subsets of rows instead of the entire query. Will the first update of a matview after a change to numeric's internal data structure cause the entire thing to be rewritten? and that we're making promises we won't be able to keep. The promise that a concurrent refresh will produce the same set of rows as a non-concurrent one? The promise that we'll always return the binary representation of the data that we saw last. When greatest(x,y) comes back 'false' for a MAX(), we then have to go check well, does the type consider them equal?, because, if the type considers them equal, we then have to decide if we should replace x with y anyway, because it's different at a binary level. That's what we're saying we'll always do now. We're also saying that we'll replace things based on plan differences rather than based on if the rows underneath actually changed at all. We could end up with material differences in the result of matviews updated through incremental REFRESH and matviews updated through actual incremental mainteance- and people may *care* about those because we've told them (or they discover) they can depend on these types of changes to be reflected in the result. Trying to do this incremental-but-not-really maintenance where the whole query is run but we try to skimp on what's actually getting updated in the matview is a premature optimization, imv, and one which may be less performant and more painful, with more gotchas and challenges for our users, to deal with in the long run. I have the evidence of a ten-fold performance improvement plus minimized WAL and replication work on my side. What evidence do you have to back your assertions? (Don't forget to work in bloat and vacuum truncation issues to the costs of your proposal.) I don't doubt that there are cases in both directions and I'm not trying to argue that it'd always be faster, but I doubt it's always slower. I'm surprised that you had a case where the query was apparently quite fast yet the data set hardly changed and resulted in a very large result but I don't doubt that it happened. What I was trying to get at is really that the delete/insert approach would be good enough in very many cases and it wouldn't have what look, to me anyway, as some pretty ugly warts around these cases. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: Kevin Grittner (kgri...@ymail.com) wrote: Stephen Frost sfr...@snowman.net wrote: I worry that adding these will come back to bite us later How? User misuse is certainly one consideration, I think that one has been talked to death already, with the consensus that we should use different operator names and document them as a result. Any comparison operator can be misused. Andres said he has seen the operators from text_pattern_ops used in production; but we have not removed, for example, ~=~ from our text operators as a result. but I wonder what's going to happen if we change our internal representation of data (eg: numerics get changed again), or when incremental matview maintenance happens and we start looking at subsets of rows instead of the entire query. Will the first update of a matview after a change to numeric's internal data structure cause the entire thing to be rewritten? Something like that could happen, if the newly generated values, for example, all took less space than the old. On the first REFRESH on the new version of the software it might delete and insert all rows; then it would stay with the new representation. I have trouble seeing that as a problem, since presumably we only come up with a new representation because it is smaller, faster, or cures some bug. and that we're making promises we won't be able to keep. The promise that a concurrent refresh will produce the same set of rows as a non-concurrent one? The promise that we'll always return the binary representation of the data that we saw last. When greatest(x,y) comes back 'false' for a MAX(), we then have to go check well, does the type consider them equal?, because, if the type considers them equal, we then have to decide if we should replace x with y anyway, because it's different at a binary level. That's what we're saying we'll always do now. I'm having a little trouble following that. The query runs as it always has, with all the old definitions of comparisons. After it is done, we check whether the rows are the same. The operation of MAX() will not be affected in any way. If MAX() returns a value which is not the same as what the matview has, the matview will be modified to match what MAX() returned. We're also saying that we'll replace things based on plan differences rather than based on if the rows underneath actually changed at all. Only if the user uses a query which does not produce deterministic results. We could end up with material differences in the result of matviews updated through incremental REFRESH and matviews updated through actual incremental mainteance- and people may *care* about those because we've told them (or they discover) they can depend on these types of changes to be reflected in the result. Perhaps we should document the recommendation that people not create materialized views with non-deterministic results, but I don't think that should be a hard restriction. For example, I could see someone creating a materialized view to pick a random sample from a jury pool to be the on the jury panel for today (from which juries are selected on that day), and not want that to be predictable and not want it to change until the next refresh of the panel matview. (That would not be my first design choice, but it would not be a horrid choice if there were sufficient logging of the REFRESH statements in a place the user could not modify.) Trying to do this incremental-but-not-really maintenance where the whole query is run but we try to skimp on what's actually getting updated in the matview is a premature optimization, imv, and one which may be less performant and more painful, with more gotchas and challenges for our users, to deal with in the long run. I have the evidence of a ten-fold performance improvement plus minimized WAL and replication work on my side. What evidence do you have to back your assertions? (Don't forget to work in bloat and vacuum truncation issues to the costs of your proposal.) I don't doubt that there are cases in both directions and I'm not trying to argue that it'd always be faster, but I doubt it's always slower. Sure. We provide a way to support those cases, although it involves blocking. So far, even the tests I expected to be faster with heap replacement have come out marginally slower that way than with CONCURRENTLY, due to index activity; but I have no doubt cases can be constructed that go the other way. I'm surprised that you had a case where the query was apparently quite fast yet the data set hardly changed and resulted in a very large result but I don't doubt that it happened. The history of all events in the county (tens of millions of records in Milwaukee County) need to be scanned to generate the status of cases and the date of last activity, as well as scanning all future calendar events to see what is scheduled. This is compared to tables setting
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: Stephen Frost sfr...@snowman.net wrote: The promise that we'll always return the binary representation of the data that we saw last. When greatest(x,y) comes back 'false' for a MAX(), we then have to go check well, does the type consider them equal?, because, if the type considers them equal, we then have to decide if we should replace x with y anyway, because it's different at a binary level. That's what we're saying we'll always do now. I'm having a little trouble following that. You're looking at it from the perspective of what's committed today, I think. The query runs as it always has, with all the old definitions of comparisons. After it is done, we check whether the rows are the same. The operation of MAX() will not be affected in any way. If MAX() returns a value which is not the same as what the matview has, the matview will be modified to match what MAX() returned. I'm referring to a case where we're doing incremental maintenance of the matview (in the far distant future..) and all we've got is the current MAX value and the value from the row being updated. We're going to update that MAX on cases where the values are equal-but-binary-different. We're also saying that we'll replace things based on plan differences rather than based on if the rows underneath actually changed at all. Only if the user uses a query which does not produce deterministic results. Which is apt to happen.. We could end up with material differences in the result of matviews updated through incremental REFRESH and matviews updated through actual incremental mainteance- and people may *care* about those because we've told them (or they discover) they can depend on these types of changes to be reflected in the result. Perhaps we should document the recommendation that people not create materialized views with non-deterministic results, but I don't think that should be a hard restriction. I wasn't suggesting a hard restriction, but I was postulating about how the behavior may change in the future (or we may need to do very hacky things to prevent it from channging) and how these decisions may impact that. What I was trying to get at is really that the delete/insert approach would be good enough in very many cases and it wouldn't have what look, to me anyway, as some pretty ugly warts around these cases. I guess we could add a DELETE everything and INSERT the new version of everything option for REFRESH in addition to what is there now, but I would be very reluctant to use it as a replacement. It wouldn't address my concerns anyway, which are around these binary operators and the update-in-place approach being risky and setting us up for problems down the road. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On Tue, Sep 24, 2013 at 12:00 PM, Stephen Frost sfr...@snowman.net wrote: It wouldn't address my concerns anyway, which are around these binary operators and the update-in-place approach being risky and setting us up for problems down the road. I think that if you want to hold up a bug fix to a feature that's already committed, you need to be more specific than to say that there might be problems down the road. I can't see that you've identified any consequence of this change that is clearly bad. I understand that you don't like the idea of making binary-comparison operators user-visible due to the risk of user-confusion, but we have lots of confusing features already and we handle that by documenting them. This one doesn't seem any worse than average; in fact, to me it seems rather minor. At any rate, I think we're getting distracted from the real point here. Here's what this patch is attempting to fix: rhaas=# create table sales (txn_date date, amount numeric); CREATE TABLE rhaas=# insert into sales values ('2013-09-01', '100.00'), ('2013-09-01', '150.00'), ('2013-09-02', '75.0'); INSERT 0 3 rhaas=# create materialized view sales_summary as select txn_date, sum(amount) as amount from sales group by 1; SELECT 2 rhaas=# create unique index on sales_summary (txn_date); CREATE INDEX rhaas=# select * from sales_summary; txn_date | amount + 2013-09-01 | 250.00 2013-09-02 | 75.0 (2 rows) rhaas=# update sales set amount = round(amount, 2); UPDATE 3 rhaas=# refresh materialized view concurrently sales_summary; REFRESH MATERIALIZED VIEW rhaas=# select * from sales_summary; txn_date | amount + 2013-09-01 | 250.00 2013-09-02 | 75.0 (2 rows) rhaas=# refresh materialized view sales_summary; REFRESH MATERIALIZED VIEW rhaas=# select * from sales_summary; txn_date | amount + 2013-09-01 | 250.00 2013-09-02 | 75.00 (2 rows) Note that, after we change the data in the underlying table, the output of the query changes. But REFRESH MATERIALIZED VIEW CONCURRENTLY doesn't fix it. However, REFRESH MATERIALIZED VIEW without CONCURRENTLY does fix it. That's a bug, because if there are two ways of refreshing a materialized view they should both produce the same answer. Shouldn't they? The fact that users can write queries that don't always give the same answer is not a reason why it's OK for REFRESH CONCURRENTLY to misbehave on queries that do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: On Tue, Sep 24, 2013 at 12:00 PM, Stephen Frost sfr...@snowman.net wrote: It wouldn't address my concerns anyway, which are around these binary operators and the update-in-place approach being risky and setting us up for problems down the road. I think that if you want to hold up a bug fix to a feature that's already committed, you need to be more specific than to say that there might be problems down the road. Committed isn't released and I simply can't agree that introducing new operators is a 'bug fix'. Had it gone out as-is, I can't see us back-patching a bunch of new operators to fix it. Note that, after we change the data in the underlying table, the output of the query changes. But REFRESH MATERIALIZED VIEW CONCURRENTLY doesn't fix it. However, REFRESH MATERIALIZED VIEW without CONCURRENTLY does fix it. That's a bug, because if there are two ways of refreshing a materialized view they should both produce the same answer. Shouldn't they? The same queries run over time without changes to the underlying data really should return the same data, shoudln't they? Is it a bug that they don't? In general, I agree that they should produce the same results, as should incrementally maintained views when they happen. I'm not convinced that choosing whatever the 'new' value is to represent the value in the matview for the equal-but-not-binary-identical will always be the correct answer though. The fact that users can write queries that don't always give the same answer is not a reason why it's OK for REFRESH CONCURRENTLY to misbehave on queries that do. This really is, imv, agreeing to hold a higher standard for matviews than we do for what matviews are *based* on- which is queries. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On Tue, Sep 24, 2013 at 1:04 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Tue, Sep 24, 2013 at 12:00 PM, Stephen Frost sfr...@snowman.net wrote: It wouldn't address my concerns anyway, which are around these binary operators and the update-in-place approach being risky and setting us up for problems down the road. I think that if you want to hold up a bug fix to a feature that's already committed, you need to be more specific than to say that there might be problems down the road. Committed isn't released and I simply can't agree that introducing new operators is a 'bug fix'. Had it gone out as-is, I can't see us back-patching a bunch of new operators to fix it. You're perfectly welcome to argue that we should rip REFRESH MATERIALIZED VIEW CONCURRENTLY back out, or change the implementation of it. However, that's not the topic of this thread. Had it gone out as is, we would have either had to come up with some more complex fix that could make things right without catalog changes, or we would have had to document that it doesn't work correctly. Fortunately we are not faced with that conundrum. Note that, after we change the data in the underlying table, the output of the query changes. But REFRESH MATERIALIZED VIEW CONCURRENTLY doesn't fix it. However, REFRESH MATERIALIZED VIEW without CONCURRENTLY does fix it. That's a bug, because if there are two ways of refreshing a materialized view they should both produce the same answer. Shouldn't they? The same queries run over time without changes to the underlying data really should return the same data, shoudln't they? Not necessarily. There are and always have been plenty of cases where that isn't true. Is it a bug that they don't? No. In general, I agree that they should produce the same results, as should incrementally maintained views when they happen. I'm not convinced that choosing whatever the 'new' value is to represent the value in the matview for the equal-but-not-binary-identical will always be the correct answer though. Well, you have yet to provide an example of where it isn't the right behavior. I initially thought that perhaps we needed a type-specific concept of exact equality, so that two things would be exactly equal if they would both be dumped the same way by pg_dump, even if the internal representations were different. But on further thought I'm no longer convinced that's a good idea. For example, consider the compact-numeric format that I introduced a few releases back. The changes are backward compatible, so you can run pg_upgrade on a cluster containing the old format and everything works just fine. But your table will be larger than you really want it to be until you rewrite it. Any materialized view that was built by selecting those numeric values will *also* be larger than you want it to be until you rewrite it. Fortunately, you can shrink the table by using a rewriting variant of ALTER TABLE. After you do that, you will perhaps also want to shrink the materialized view. And in fact, REFRESH will do that for you, but currently, REFRESH CONCURRENTLY won't. It seems to me that it would be nicer if it did. Now I admit that's an arguable point. We could certainly define an intermediate notion of equality that is more equal than whatever = does, but not as equal as exact binary equality. However, such a new notion would have no precedence in our existing code, whereas the use of binary equality does. Going to a lot of work to build up this intermediate notion of equality does not seem like a good idea to me; I think a general rule of good programming is not to invent more ways to do things than are clearly necessary, and requiring every core and extension type to define an exact-equality operator just to support this feature seems like excessive in the extreme. Moreover, I think what to include in that intermediate notion of equality would be kind of hard to decide, because there are a lot of subtly different questions that one can ask, and we'd inevitably have to answer the question how equal does it have to be to be equal enough?. Numerics and arrays have multiple ways of referencing what is intended to be the exact same value, but those can be disambiguated by passing them to a function that cares, like pg_column_size(). Then, on a user-visible level, numerics allow variation in the number of trailing zeroes after the decimal point. Floats have extra digits that aren't shown except with extra_float_digits, so that the value can be less than totally equal even if the text representation is the same. citext intentionally ignores case. In every one of those cases, it's possible that the user of materialized views might say you know, if it's equal enough that the = operator says it's the same, then I really don't mind if the materialized view maintenance gets skipped. But it's also possible that they might say, you know, those
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: Now I admit that's an arguable point. We could certainly define an intermediate notion of equality that is more equal than whatever = does, but not as equal as exact binary equality. I suggested it up-thread and don't recall seeing a response, so here it is again- passing the data through the binary-out function for the type and comparing *that* would allow us to change the interal binary representation of data types and would be something which we could at least explain to users as to why X isn't the same as Y according to this binary operator. I think the conservative (and therefore correct) approach is to decide that we're always going to update if we detect any difference at all. And there may be users who are surprised that a refresh changed parts of the table that have nothing to do with what was changed in the underlying relation, because a different plan was used and the result ended up being binary-different. It's easy to explain to users why that would be when we're doing a wholesale replace but I don't think it'll be nearly as clear why that happened when we're not replacing the whole table and why REFRESH can basically end up changing anything (but doesn't consistently). If we're paying attention to the records changed and only updating the matview's records when they're involved, that becomes pretty clear. What's happening here feels very much like unintended consequences. It is obviously true, and unavoidable, that if the query can produce more than one result set depending on the query plan or other factors, then the materialized view contents can match only one of those possible result sets. But you are arguing that it should be allowed to produce some OTHER result set, that a direct execution of the query could *never* have produced, and I can't see how that can be right. I agree that the matview shouldn't produce things which *can't* exist through an execution of the query. I don't intend to argue against that but I realize that's a fallout of not accepting the patch to implement the binary comparison operators. My complaint is more generally that if this approach needs such then there's going to be problems down the road. No, I can't predict exactly what they are and perhaps I'm all wet here, but this kind of binary-equality operations are something I've tried to avoid since discovering what happens when you try to compare a couple of floating point numbers. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: I think the conservative (and therefore correct) approach is to decide that we're always going to update if we detect any difference at all. And there may be users who are surprised that a refresh changed parts of the table that have nothing to do with what was changed in the underlying relation, because a different plan was used and the result ended up being binary-different. Binary different for equal values could include box (or other geometry) objects moved to completely new locations and/or not quite the same size. Here is v2 of the patch which changes from the universally disliked operator names v1 used. It also fixes bugs in the row comparisons for pass-by-reference types, fixes a couple nearby comments, and adds regression tests for a matview containing a box column. The box type is interesting in that its = operator only worries about the area of the box, and considers two boxes with no more than EPSILON difference to be equal. This means that boxes at totally different locations can be equal, and that if A = B and B = C it is not necessarily true that A = C. This doesn't cause too much trouble in general because boxes don't have btree opclasses. Since there are types, including core types, without a default btree opclass, materialized views containing them cannot use RMVC as currently committed. This patch would fix that, or we could rip out the current implementation and go to the delete everything and insert the entire new matview contents approach. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company*** a/contrib/citext/expected/citext.out --- b/contrib/citext/expected/citext.out *** *** 2276,2278 SELECT like_escape( name::text, ''::citext ) = like_escape( name::text, '' ) AS --- 2276,2319 t (5 rows) + -- Ensure correct behavior for citext with materialized views. + CREATE TABLE citext_table ( + id serial primary key, + name citext + ); + INSERT INTO citext_table (name) + VALUES ('one'), ('two'), ('three'), (NULL), (NULL); + CREATE MATERIALIZED VIEW citext_matview AS + SELECT * FROM citext_table; + CREATE UNIQUE INDEX citext_matview_id + ON citext_matview (id); + SELECT * + FROM citext_matview m + FULL JOIN citext_table t ON (t.id = m.id AND t *= m) + WHERE t.id IS NULL OR m.id IS NULL; + id | name | id | name + +--++-- + (0 rows) + + UPDATE citext_table SET name = 'Two' WHERE name = 'TWO'; + SELECT * + FROM citext_matview m + FULL JOIN citext_table t ON (t.id = m.id AND t *= m) + WHERE t.id IS NULL OR m.id IS NULL; + id | name | id | name + +--++-- + | | 2 | Two + 2 | two || + (2 rows) + + REFRESH MATERIALIZED VIEW CONCURRENTLY citext_matview; + SELECT * FROM citext_matview ORDER BY id; + id | name + +--- + 1 | one + 2 | Two + 3 | three + 4 | + 5 | + (5 rows) + *** a/contrib/citext/expected/citext_1.out --- b/contrib/citext/expected/citext_1.out *** *** 2276,2278 SELECT like_escape( name::text, ''::citext ) = like_escape( name::text, '' ) AS --- 2276,2319 t (5 rows) + -- Ensure correct behavior for citext with materialized views. + CREATE TABLE citext_table ( + id serial primary key, + name citext + ); + INSERT INTO citext_table (name) + VALUES ('one'), ('two'), ('three'), (NULL), (NULL); + CREATE MATERIALIZED VIEW citext_matview AS + SELECT * FROM citext_table; + CREATE UNIQUE INDEX citext_matview_id + ON citext_matview (id); + SELECT * + FROM citext_matview m + FULL JOIN citext_table t ON (t.id = m.id AND t *= m) + WHERE t.id IS NULL OR m.id IS NULL; + id | name | id | name + +--++-- + (0 rows) + + UPDATE citext_table SET name = 'Two' WHERE name = 'TWO'; + SELECT * + FROM citext_matview m + FULL JOIN citext_table t ON (t.id = m.id AND t *= m) + WHERE t.id IS NULL OR m.id IS NULL; + id | name | id | name + +--++-- + | | 2 | Two + 2 | two || + (2 rows) + + REFRESH MATERIALIZED VIEW CONCURRENTLY citext_matview; + SELECT * FROM citext_matview ORDER BY id; + id | name + +--- + 1 | one + 2 | Two + 3 | three + 4 | + 5 | + (5 rows) + *** a/contrib/citext/sql/citext.sql --- b/contrib/citext/sql/citext.sql *** *** 711,713 SELECT COUNT(*) = 19::bigint AS t FROM try; --- 711,736 SELECT like_escape( name, '' ) = like_escape( name::text, '' ) AS t FROM srt; SELECT like_escape( name::text, ''::citext ) = like_escape( name::text, '' ) AS t FROM srt; + + -- Ensure correct behavior for citext with materialized views. + CREATE TABLE citext_table ( + id serial primary key, + name citext + ); + INSERT INTO citext_table (name) + VALUES ('one'), ('two'), ('three'), (NULL), (NULL); + CREATE MATERIALIZED VIEW citext_matview AS + SELECT * FROM citext_table; + CREATE UNIQUE INDEX citext_matview_id + ON citext_matview (id); +
Re: [HACKERS] record identical operator
On Tue, Sep 24, 2013 at 2:22 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: Now I admit that's an arguable point. We could certainly define an intermediate notion of equality that is more equal than whatever = does, but not as equal as exact binary equality. I suggested it up-thread and don't recall seeing a response, so here it is again- passing the data through the binary-out function for the type and comparing *that* would allow us to change the interal binary representation of data types and would be something which we could at least explain to users as to why X isn't the same as Y according to this binary operator. I think the conservative (and therefore correct) approach is to decide that we're always going to update if we detect any difference at all. And there may be users who are surprised that a refresh changed parts of the table that have nothing to do with what was changed in the underlying relation, because a different plan was used and the result ended up being binary-different. It's easy to explain to users why that would be when we're doing a wholesale replace but I don't think it'll be nearly as clear why that happened when we're not replacing the whole table and why REFRESH can basically end up changing anything (but doesn't consistently). If we're paying attention to the records changed and only updating the matview's records when they're involved, that becomes pretty clear. What's happening here feels very much like unintended consequences. FWIW you make some interesting points (I did a triple take on the plan dependent changes) but I'm 100% ok with the proposed behavior. Matviews satisfy 'truth' as *defined by the underlying query only*. This is key: there may be N candidate 'truths' for that query: it's not IMNSHO reasonable to expect the post-refresh truth to be approximately based in the pre-refresh truth. Even if the implementation happened to do what you're asking for it would only be demonstrating undefined but superficially useful behavior...a good analogy would be the old scan behavior where an unordered scan would come up in 'last update order'. That (again, superficially useful) behavior was undefined and we reserved the right to change it. And we did. Unnecessarily defined behaviors defeat future performance optimizations. So Kevin's patch AIUI defines a hitherto non-user accessible (except in the very special case of row-wise comparison) mechanic to try and cut down the number of rows that *must* be refreshed. It may or may not do a good job at that on a situational basis -- if it was always better we'd probably be overriding the default behavior. I don't think it's astonishing at all for matview to pseudo-randomly adjust case over a citext column; that's just part of the deal with equality ambiguous types. As long as the matview doesn't expose a dataset that was impossible to have been generated by the underlying query, I'm good. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On Fri, Sep 20, 2013 at 11:23 AM, Stephen Frost sfr...@snowman.net wrote: Perhaps I'm assuming things are farther along than they are.. I was assumed that 'incremental mat view' updates were actuallly 'partial'- iow, that it keeps track of the rows in the mat view which are impacted by the set of rows in the base relations and then runs specific queries to pull out and operate on the base-rel set-of-rows to update the specific mat-view-rows. If it's running the whole query again then it's certainly more likely to get the same results that the user did, but it's not a guarantee and that's only a happy coincidence while we're still doing the whole query approach (which I hope we'd eventually progress beyond..). It seems odd to me that you have such strong opinions about what is and is not acceptable here given that you don't seem to familiar with the current state of this work. I will attempt to summarize my understanding. In 9.3, we can refresh a materialized view by taking an access exclusive lock on the relation, rerunning the query, and replacing the contents wholesale. In master, there is a new option to perform the refresh concurrently, which is documented here: http://www.postgresql.org/docs/devel/static/sql-refreshmaterializedview.html It reruns the query in its entirety and then figures out what inserts, updates, and deletes are needed to make the matview match the output of the query (see refresh_by_match_merge). This is an improvement over what is available in 9.3, because even though you still have to rerun the full query, you don't have to lock out access to the table in order to apply the changes. However, currently, it sometimes fails to perform updates that are needed to make the contents of the view match the query output, because it relies on a notion of equality other than exact equality. Kevin is proposing to fix this problem via this patch. Now, the next project Kevin's going to work on, and that he was working on when he discovered this problem, is incremental maintenance: that is, allowing us to update the view *without* needing to rerun the entire query. This record comparison operator will be just as important in that context. The *only* strategy refreshing a materialized view that *doesn't* need this operator is the only we have in 9.3: through out all the old data and replace it with the result of re-executing the query. If you want anything smarter than that, you MUST compare old and new rows for equality so that you can update only those rows that have been changed. And if you compare them *any strategy other than the one Kevin is proposing*, namely binary equality, then you may sometimes decide that a row has not been changed when it really has, and then you won't update the row, and then incremental maintenance will be enabled to produce *wrong answers*. So to me this has come to seem pretty much open and shut. We all know that materialized views are not going to always match the data returned by the underlying query. Perhaps the canonical example is SELECT random(). As you pointed out upthread, any query that is nondeterministic is a potential problem for materialized views. When you write a query that can return different output based on the order in which input rows are scanned, or based on any sort of external state such as a random-number generator, each refresh of a materialized view based on that query may produce different answers. There's not a lot we can do about that, except tell people to avoid using such queries in materialized views that they expect to be stable. However, what we're talking about here is a completely separate problem. Even if the query is 100% deterministic, without this patch, the materialized view can get out of sync with the query results. Granted, most of the ways in which it can get out of sync are fairly subtle: failing to preserve case in a data type where comparisons are text-insensitive; gaining or loosing an all-zeroes null bitmap on an array type; adding or removing trailing zeroes after the decimal point in a numeric. If the materialized view sometimes said 1 when the query was returning 0, we'd presumably all say that's a bug, let's fix it. But what we're actually talking about is that the query returns 0.00 and the view still says zero. So we're doing a lot of soul-searching about whether that's unequal enough to justify updating the row. Personally, though, there's not a lot of doubt in my mind. If I create a table and I put 0 into a column of that table and then create a materialized view and that 0 ends up in the materialized view, and then I update the 0 to 0.00 and refresh the view, I expect that change to propagate through to the materialized view. It works that way if I select from a regular, non-materialized view; and it also works that way if I select from the table itself. The idea that materialized views should somehow be exempt from reflecting changes to the underlying data in
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: It seems odd to me that you have such strong opinions about what is and is not acceptable here given that you don't seem to familiar with the current state of this work. That'd be because the arguments that I've been trying to make around this havn't had terribly much to do with the current state of this work but rather the higher level concepts- which are more important anyway, imv. I will attempt to summarize my understanding. In 9.3, we can refresh a materialized view by taking an access exclusive lock on the relation, rerunning the query, and replacing the contents wholesale. In master, there is a new option to perform the refresh concurrently, which is documented here: http://www.postgresql.org/docs/devel/static/sql-refreshmaterializedview.html It reruns the query in its entirety and then figures out what inserts, updates, and deletes are needed to make the matview match the output of the query (see refresh_by_match_merge). So I've gathered and I'm not terribly surprised that it's broken. It was unfortunate that we committed it as such, imv. I'm *not* convinced that means we should build on that in a direction that doesn't appear to be getting us any closer to real matviews. Being in master doesn't make it released. This is an improvement over what is available in 9.3, because even though you still have to rerun the full query, you don't have to lock out access to the table in order to apply the changes. I see how you can view it as an improvement, but consider what you'd say if a user showed up on IRC wanting to implement *exactly* this in their DB using a cronjob and a bit of plpgsql. Would we encourage him/her and say gee, that's a great approach, just compare the whole row! Heck no, we'd say uhh, you need a key there that you can depend on when you're doing your updates. We'd probably even suggest that they, I dunno, make that key their *primary key* for the matview. If they asked what would happen with their little plpgsql code when using citext or other, similar, type, we'd tell them exactly what would happen when using such a type with regular updates, and maybe even describe how they could get themselves into trouble if they tried to issue updates which changed those key fields and therefore run into possible lock contention from the unique index on values-that-are-not-really-unique-to-them. I can't see anyone encouraging that and here we are building a major new feature on it! To be honest, I'm amazed that I appear to be alone with this. Now, the next project Kevin's going to work on, and that he was working on when he discovered this problem, is incremental maintenance: that is, allowing us to update the view *without* needing to rerun the entire query. This record comparison operator will be just as important in that context. You state this but I don't see where you justify this claim.. The *only* strategy refreshing a materialized view that *doesn't* need this operator is the only we have in 9.3: through out all the old data and replace it with the result of re-executing the query. I really don't agree with this, but allow me to play along a bit. What is going to happen with this incremental updates when you want to use an index to figure out what to update? Are you going to build the index using your binary equality operator, which could allow *duplicates that are not really duplicates*, even in the externally-visible world? And then we'll be returning multiple rows to the user when we shouldn't be? There's a whole slew of evil examples of this- it's called unnecessary DISTINCT's. Or perhaps you'll simply look up based on the actual equality answer and then *pick one* to use, perhaps the most recent, but then that may or may not equal what the actual query would generate *anyway* because of costing, join ordering, etc, etc, as I already pointed out. You're trying to guarantee something here that you can't. Trying to fake it and tell the users oh, you'll be ok is actually worse because it means they'll try and depend on it and then get all kinds of upset when it ends up *not* working. I'd much rather tell users look, if you want to use these for *data* fields, that's fine, because our smart matview will figure out the keys associated with a GROUP BY and will update based on those, but if you GROUP BY a column with a type whose equality operator allows things to be different, you'll get *the same behavior you would get from the query*, which is to say, it won't be deterministic. If you want anything smarter than that, you MUST compare old and new rows for equality so that you can update only those rows that have been changed. And if you compare them *any strategy other than the one Kevin is proposing*, namely binary equality, then you may sometimes decide that a row has not been changed when it really has, and then you won't update the row, and then incremental maintenance
Re: [HACKERS] record identical operator
On Mon, Sep 23, 2013 at 1:19 PM, Stephen Frost sfr...@snowman.net wrote: So I've gathered and I'm not terribly surprised that it's broken. It was unfortunate that we committed it as such, imv. I'm *not* convinced that means we should build on that in a direction that doesn't appear to be getting us any closer to real matviews. Being in master doesn't make it released. This is an improvement over what is available in 9.3, because even though you still have to rerun the full query, you don't have to lock out access to the table in order to apply the changes. I see how you can view it as an improvement, but consider what you'd say if a user showed up on IRC wanting to implement *exactly* this in their DB using a cronjob and a bit of plpgsql. Would we encourage him/her and say gee, that's a great approach, just compare the whole row! Heck no, we'd say uhh, you need a key there that you can depend on when you're doing your updates. We'd probably even suggest that they, I dunno, make that key their *primary key* for the matview. If they asked what would happen with their little plpgsql code when using citext or other, similar, type, we'd tell them exactly what would happen when using such a type with regular updates, and maybe even describe how they could get themselves into trouble if they tried to issue updates which changed those key fields and therefore run into possible lock contention from the unique index on values-that-are-not-really-unique-to-them. I can't see anyone encouraging that and here we are building a major new feature on it! To be honest, I'm amazed that I appear to be alone with this. I've written various scripts over the years for this kind of thing, and they all compared the whole row. I used a PK comparison to determine which row needed to be updated and then compared the remaining fields to figure out which values needed to be updated. I had assumed that's how most people did it; what do you do? Anyway, that is exactly what Kevin is proposing to do here and, to be clear, he's NOT proposing to use the binary-identical semantics to identify the row to be updated. That will happen using the semantics of whatever index the user chooses to create on the PK column. Rather, he's only using the binary-identical to decide which rows are completely unchanged. I might be wrong here, but it seems to me that that renders most of your argument here moot. Under Kevin's proposal, changing a citext column that acts as a PK for the matview WON'T cause the row to be deleted and reinserted, but what it will do is say, oh, it's the same row (the values are equal) but the case is different (the rows are not binary-equal), let me go update the PK column with the new value. From where I stand, that seems like exactly the right behavior. What do you think should happen instead? Now, the next project Kevin's going to work on, and that he was working on when he discovered this problem, is incremental maintenance: that is, allowing us to update the view *without* needing to rerun the entire query. This record comparison operator will be just as important in that context. You state this but I don't see where you justify this claim.. The next sentence was intended as justification. The *only* strategy refreshing a materialized view that *doesn't* need this operator is the only we have in 9.3: through out all the old data and replace it with the result of re-executing the query. I really don't agree with this, but allow me to play along a bit. What is going to happen with this incremental updates when you want to use an index to figure out what to update? We already do use an index to figure out what to update with concurrent refresh; or at least we can, when the query planner thinks that is the cheapest strategy. I don't imagine that that part will change significantly for incremental updates. Are you going to build the index using your binary equality operator, which could allow *duplicates that are not really duplicates*, even in the externally-visible world? And then we'll be returning multiple rows to the user when we shouldn't be? No, see above. There's no intention that the user should need to use this new opclass to make materialized views work correctly. The user is required to define a unique index in order to use REFRESH MATERIALIZED VIEW CONCURRENTLY, and that unique index defines the semantics used to compare existing rows to new candidate rows. Only after we've matched up the old and new rows do we use the exact-equality semantics to decide whether to update the existing row or do nothing. There's a whole slew of evil examples of this- it's called unnecessary DISTINCT's. Or perhaps you'll simply look up based on the actual equality answer and then *pick one* to use, perhaps the most recent, Sure, all of that would suck, but nobody's proposing anything like that. but then that may or may not equal what the actual query
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: Robert Haas (robertmh...@gmail.com) wrote: Now, the next project Kevin's going to work on, and that he was working on when he discovered this problem, is incremental maintenance: that is, allowing us to update the view *without* needing to rerun the entire query. This record comparison operator will be just as important in that context. You state this but I don't see where you justify this claim.. Unless we can tell whether there are any differences between two versions of a row, we can't accurately generate the delta to drive the incremental maintenance. The initial thread discussing how incremental maintenance would be done is here: http://www.postgresql.org/message-id/flat/1368561126.64093.yahoomail...@web162904.mail.bf1.yahoo.com#1368561126.64093.yahoomail...@web162904.mail.bf1.yahoo.com The thread on the initial patch for REFRESH MATERIALIZED VIEW CONCURRENTLY started with: | Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY | for 9.4 CF1. The goal of this patch is to allow a refresh | without interfering with concurrent reads, using transactional | semantics. | | It is my hope to get this committed during this CF to allow me to | focus on incremental maintenance for the rest of the release cycle. There was much discussion, testing, and revision then, which is here: http://www.postgresql.org/message-id/flat/1371225929.28496.yahoomail...@web162905.mail.bf1.yahoo.com#1371225929.28496.yahoomail...@web162905.mail.bf1.yahoo.com I think pretty much every concern raised was addressed except for a lingering doubt expressed by Noah over whether IS NOT DISTINCT FROM semantics were really the right basis for matching rows. Based on that feedback, I spent a lot of time looking at why that might or might not be correct, and decided that I had been wrong to base the behavior on that, for the reasons Robert expressed so clearly a couple messages back. Hence this patch. I had made the mistake of using an operator which used a column-by-column comparison based on the equality operator for the default opclass for comparing two values of each respective column type. I had been challenged on that in the review process, and was responding to it with the fix contained in this patch. The first post on this thread starts with: | Attached is a patch for a bit of infrastructure I believe to be | necessary for correct behavior of REFRESH MATERIALIZED VIEW | CONCURRENTLY as well as incremental maintenance of matviews. I think it is fairly obvious that REFRESH should REgenerate a FRESH copy of the data, versus incremental maintenance -- which attempts to keep the matview up-to-date without regenerating the full set of data. Whenever there is logical replication (and materialized views are, conceptually, one form of that -- within the database) I feel it is important to be able to correct any possible drift. With matviews, I see the way to do that as the REFRESH command, and I feel that it is important to be able to do that in a way that can run concurrently with readers of the matview -- without blocking them or being blocked by them. Discussion of incremental maintenance really belongs on a different thread. Since I have gone to the trouble to read a lot of papers on the topic, and select one that I think is a good basis for our implementation, I hope everyone will frame discussion in terms of either: - how best to implement the techniques from that paper, or - why some other paper presents a better technique. I think it would be madness to approach implementing incremental maintenance based on an ad hoc set of thoughts rather than a peer-reviewed paper. I know how tempting it is it start from zero and think, if we just do X we can cover this sort of query. I had a few thoughts like that before I read all those papers and discovered that there were many subtle issues to cover. We will have plenty of time to get creative with alternatives when we get past the query types specifically addressed in the paper and begin to move into, for example, CTEs and window functions. I certainly expect robust discussion around those areas, or even some of the infrastructure for capturing deltas and triggering the incremental maintenance. I really didn't expect to have to burn so much time and energy arguing over whether a REFRESH should leave the matview accurately containing the results of the matview's query. We can argue about how it should be named After looking at the existing suggestions and thinking about it I'm leaning toward these operators (based on a star in front of the usual default comparison operators): *= * * *= * *= and whether it should be documented I thought we had a consensus to document both the existing record comparison operators and these new ones, and I'm fine with that. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: Unless we can tell whether there are any differences between two versions of a row, we can't accurately generate the delta to drive the incremental maintenance. This is predicated on the assumption that you simply generate the new view and then try to figure out what to go update. I'm trying to explain that using that methodology is what landed us in this situation to begin with. The initial thread discussing how incremental maintenance would be done is here: My apologies for not paying quite as close attention to that thread as I should have. I think it is fairly obvious that REFRESH should REgenerate a FRESH copy of the data, versus incremental maintenance -- which attempts to keep the matview up-to-date without regenerating the full set of data. Having 'REFRESH' regenerate a fresh copy of the data makes sense to me, and is what we have now, no? The only issue there is that it takes out a big lock, which I appreciate that you're trying to get rid of. Whenever there is logical replication (and materialized views are, conceptually, one form of that -- within the database) I feel it is important to be able to correct any possible drift. With matviews, I see the way to do that as the REFRESH command, and I feel that it is important to be able to do that in a way that can run concurrently with readers of the matview -- without blocking them or being blocked by them. Of course. Discussion of incremental maintenance really belongs on a different thread. I'm really getting tired of everyone saying this is the only way to do it (or perhaps well, this is already committed, therefore it must be what we're gonna do) when a) we're already planning to rip this out and change it, or so I thought, and b) we're trying to make promises we can't keep with this approach. Since I have gone to the trouble to read a lot of papers on the topic, and select one that I think is a good basis for our implementation, I hope everyone will frame discussion in terms of either: - how best to implement the techniques from that paper, or - why some other paper presents a better technique. My recollection from the hackers meeting is that I'm trying to simply paraphrase what you had said was in the paper wrt keeping track of what rows are changed underneath and using that as a basis to implement the changes necessary in the view. Does the paper you're referring to describe rerunning the whole query and then trying to figure out what's been changed..? That's really what I'm having trouble understanding why anyone would want to implement. I'll try and find time to hunt down the threads and papers on it, but I really could have sworn this was gone over at the hacker meeting- and it made a lot of sense to me, then. I really didn't expect to have to burn so much time and energy arguing over whether a REFRESH should leave the matview accurately containing the results of the matview's query. I appreciate you bringing me up to speed on where things actually are here- again, sorry for not realizing the direction that this was going in earlier; it really didn't even occur to me that it would have gone down this road. I, also, didn't expect to spend so much time on this. We can argue about how it should be named Really, I'm back to trying to figure out why we want to go down this road at all. and whether it should be documented I thought we had a consensus to document both the existing record comparison operators and these new ones, and I'm fine with that. If it gets added, it certainly should be documented, and heavily caveated. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On Mon, Sep 23, 2013 at 2:46 PM, Robert Haas robertmh...@gmail.com wrote: Anyway, that is exactly what Kevin is proposing to do here and, to be clear, he's NOT proposing to use the binary-identical semantics to identify the row to be updated. That will happen using the semantics of whatever index the user chooses to create on the PK column. Rather, he's only using the binary-identical to decide which rows are completely unchanged. I might be wrong here, but it seems to me that that renders most of your argument here moot. Under Kevin's proposal, changing a citext column that acts as a PK for the matview WON'T cause the row to be deleted and reinserted, but what it will do is say, oh, it's the same row (the values are equal) but the case is different (the rows are not binary-equal), let me go update the PK column with the new value. From where I stand, that seems like exactly the right behavior. What do you think should happen instead? Ah, I'm wrong here. It is true that we use the unique index semantics to compare the results of rerunning the query to what's in the view, but it's not true (currently) that we ever do updates on the view as opposed to delete-and-reinsert. Apparently that code got (mostly) ripped out at some point, but I was confused by a comment that wasn't fully updated to reflect the new reality. Still, I believe that most of the points I'm making here remain valid, because the key assumption you seem to be making is that Kevin is proposing to use binary-identical semantics throughout, and that's not true. Old rows and new candidate rows are matched up using the user-specified opclass, but binary-identical. Binary-identical just determines whether to replace the rows. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Robert, all, Skipping out on much of the side-discussion to try and drive at the heart of this.. * Robert Haas (robertmh...@gmail.com) wrote: I would suggest that you read the referenced papers for details as to how the algorithm works. To make a long story short, they do need to keep track of what's changed, and how. I suppose it's good to know that I wasn't completely misunderstanding the discussion in Ottawa. However, that still seems largely orthogonal to the present discussion. It *solves* this issue, from where I'm sitting, without any binary operators at all. rows 4, 5, 6 are used to compose matrow 1. When 4, 5, or 6 are updated, matrow 1 gets updated. When the update happens, and it *will* happen (there's no question about oh, should we actually update this record?), it's a normal PG update and all of the values in the row get set to whatever the new values are. The only reason we've having any of this discussion is because, in the current implementation, aiui anyway, we're saying oh, the user wants us to update the matview, but we have *no clue* what actually changed, so we're just gonna try and guess.. This is changing that from we're gonna try and guess.. to well, if they aren't *binary identical*, we'll change them. However, if you're actually tracking what's changing and rebuilding the rows based on that, there's no question about binary equality, and you're still only updating the rows that actually need to be updated. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: I'm trying to explain that using that methodology is what landed us in this situation to begin with. I'm trying to figure out what situation you think we're in. Seriously, if you could apply the patch and show one example that demonstrates what you see to be a problem, that would be great. I think it is fairly obvious that REFRESH should REgenerate a FRESH copy of the data, versus incremental maintenance -- which attempts to keep the matview up-to-date without regenerating the full set of data. Having 'REFRESH' regenerate a fresh copy of the data makes sense to me, and is what we have now, no? The only issue there is that it takes out a big lock, which I appreciate that you're trying to get rid of. Whenever there is logical replication (and materialized views are, conceptually, one form of that -- within the database) I feel it is important to be able to correct any possible drift. With matviews, I see the way to do that as the REFRESH command, and I feel that it is important to be able to do that in a way that can run concurrently with readers of the matview -- without blocking them or being blocked by them. Of course. Discussion of incremental maintenance really belongs on a different thread. I'm really getting tired of everyone saying this is the only way to do it (or perhaps well, this is already committed, therefore it must be what we're gonna do) What I'm saying is that REFRESH and incremental maintenance are two different things, and conflating them just confuses everything. when a) we're already planning to rip this out and change it, or so I thought, The entire change to matview-specific code is to use a different operator in two places. Outside of that, it consists of adding the 12th non-default opclass to core. and b) we're trying to make promises we can't keep with this approach. I don't see any such. If you do, please describe them; or better yet, give an example. Since I have gone to the trouble to read a lot of papers on the topic, and select one that I think is a good basis for our implementation, I hope everyone will frame discussion in terms of either: - how best to implement the techniques from that paper, or - why some other paper presents a better technique. My recollection from the hackers meeting is that I'm trying to simply paraphrase what you had said was in the paper wrt keeping track of what rows are changed underneath and using that as a basis to implement the changes necessary in the view. Does the paper you're referring to describe rerunning the whole query and then trying to figure out what's been changed..? That's really what I'm having trouble understanding why anyone would want to implement. I'll try and find time to hunt down the threads and papers on it, but I really could have sworn this was gone over at the hacker meeting- and it made a lot of sense to me, then. The only thing the paper says on the topic is that any incremental maintenance scheme is a heuristic. There will always be cases when it would be faster and less resource-intensive to regenerate the data from the defining query. There is at least an implication that a good implementation will try to identify when it is in such a situation, and ignore the whole incremental maintenance approach in favor of what we are doing with REFRESH. The example they give is if there is an unqualified DELETE of every row in a table which is part of an inner join generating the result, that it would almost be faster to to generate the (empty) result set than to run their algorithm to determine that all the rows need to be deleted. One reason for having a REFRESH that re-runs the query like this is that it *is* a recommended escape hatch when a mass operation makes the incremental calculations too expensive. I really didn't expect to have to burn so much time and energy arguing over whether a REFRESH should leave the matview accurately containing the results of the matview's query. I appreciate you bringing me up to speed on where things actually are here- again, sorry for not realizing the direction that this was going in earlier; it really didn't even occur to me that it would have gone down this road. I, also, didn't expect to spend so much time on this. We can argue about how it should be named Really, I'm back to trying to figure out why we want to go down this road at all. and whether it should be documented I thought we had a consensus to document both the existing record comparison operators and these new ones, and I'm fine with that. If it gets added, it certainly should be documented, That seems to be the consensus. In fact, I would have submitted that with this patch if there had been any documentation for the default record comparison operators. It seemed like that might have been omitted on purpose, and it seemed weird to add documentation for a non-default operator for
Re: [HACKERS] record identical operator
On Mon, Sep 23, 2013 at 3:45 PM, Stephen Frost sfr...@snowman.net wrote: Robert, all, Skipping out on much of the side-discussion to try and drive at the heart of this.. * Robert Haas (robertmh...@gmail.com) wrote: I would suggest that you read the referenced papers for details as to how the algorithm works. To make a long story short, they do need to keep track of what's changed, and how. I suppose it's good to know that I wasn't completely misunderstanding the discussion in Ottawa. However, that still seems largely orthogonal to the present discussion. It *solves* this issue, from where I'm sitting, without any binary operators at all. rows 4, 5, 6 are used to compose matrow 1. When 4, 5, or 6 are updated, matrow 1 gets updated. When the update happens, and it *will* happen (there's no question about oh, should we actually update this record?), it's a normal PG update and all of the values in the row get set to whatever the new values are. I don't know why there shouldn't be a question about that. Suppose that the MAX() aggregate is in use. If 4 or 5 or 6 is updated so as to change the maximum of the three, then matrow 1 needs updating. But if the maximum remains the same, then it doesn't. The right way to decide whether it needs updating is to re-aggregate those three rows and then see whether you get the same (read: binary identical) out of the aggregate that you got the last time you ran it. Also, suppose the same statement updates row 4, row 5, and row 6. Instead of updating the materialized view three times, you do it just once at end-of-statmement, like an AFTER STATEMENT trigger that somehow knows which rows were updated. In this case even something like AVG() could produce the same result as it did before the update. And you'd surely want to avoid updating the matview if the new value was the same as what was already stored in the matview (but not if it was equal but not the same). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: I don't know why there shouldn't be a question about that. Because anything else would be an internal optimization which must be proven to be correct, imv, also.. Suppose that the MAX() aggregate is in use. If 4 or 5 or 6 is updated so as to change the maximum of the three, then matrow 1 needs updating. But if the maximum remains the same, then it doesn't. The right way to decide whether it needs updating is to re-aggregate those three rows and then see whether you get the same (read: binary identical) out of the aggregate that you got the last time you ran it. You could argue the same about PG doing that for any row update- check if anything is actually *binary* different and, if not, then don't update it. Of course, there's questions about if that's right and what about triggers, etc.. Also, suppose the same statement updates row 4, row 5, and row 6. Instead of updating the materialized view three times, you do it just once at end-of-statmement, like an AFTER STATEMENT trigger that somehow knows which rows were updated. Sorry if I wasn't clear, but that's exactly what I was trying to describe regarding how it should work. I was NOT intending to suggest that each update immediately update the matview. It's just that we keep *track* of what was updated and then, at some convenient point, actually run the process to update the matview rows (maybe in an AFTER statement, maybe 5 minutes from now). In this case even something like AVG() could produce the same result as it did before the update. Sure it could. And you'd surely want to avoid updating the matview if the new value was the same as what was already stored in the matview (but not if it was equal but not the same). I don't see why updating a row that was built with AVG() should be avoided over a row that was built with MAX(), unless you're suggesting there's a different set of rows involved in the two or there's some additional optimization around figuring out if these particular changes *should* actually change the result. That's an analysis which could still happen and wouldn't need to rely on any binary equality test, and it'd need to have a whole lot more smarts than this approach anyway or you'll still end up running a query against all of the rows involved in the AVG() to then only decide at the last moment to not update the row, which doesn't strike me as a great optimization. Perhaps that's why we didn't implement it for PG itself? Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: The only thing the paper says on the topic is that any incremental maintenance scheme is a heuristic. There will always be cases when it would be faster and less resource-intensive to regenerate the data from the defining query. There is at least an implication that a good implementation will try to identify when it is in such a situation, and ignore the whole incremental maintenance approach in favor of what we are doing with REFRESH. The example they give is if there is an unqualified DELETE of every row in a table which is part of an inner join generating the result, that it would almost be faster to to generate the (empty) result set than to run their algorithm to determine that all the rows need to be deleted. One reason for having a REFRESH that re-runs the query like this is that it *is* a recommended escape hatch when a mass operation makes the incremental calculations too expensive. I was just chatting about this with a coworker and we see a *lot* of cases where an unqualified DELETE and then INSERT would be perfectly acceptable over the existing heavy lock that's taken out. We're already doing that in a lot of places because it's actually pretty cheap and the view results aren't all that big. It's also completely defensible and doesn't require any new operators. and heavily caveated. I'm not sure what caveats would be needed. It seems to me that a clear description of what it does would suffice. Like all the other non-default opclasses in core, it will be non-default because it is less frequently useful. This will claim things are different, even when they aren't different when cast to text, or possibly even when extracted in binary mode, ensure this is really what you want is a pretty big caveat, imv. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: In this case even something like AVG() could produce the same result as it did before the update. And you'd surely want to avoid updating the matview if the new value was the same as what was already stored in the matview (but not if it was equal but not the same). BTW, I'm wondering how you're going to explain that, with an optimization of MAX() for matviews that only looks at the records which were updated, that we decided to change 'A' to 'a' on a citext column because they were binary different, yet if you considered all of the rows under the MAX(), 'A' would have nearly always come first and therefore would have been the more likely candidate (and what the user probably would have gotten running the query themselves). Also, as asked, I'll work up an example of how the matview can produce a different result from the regular view in some cases, even with the binary equality stuff. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: I'm trying to figure out what situation you think we're in. Seriously, if you could apply the patch and show one example that demonstrates what you see to be a problem, that would be great. Here's an example to illustrate what I'm talking about when it comes down to you can't claim that you'll produce exactly what the query will always, with these types: =# table citext_table; id | name +--- 1 | one 3 | three 5 | 4 | Three 2 | Three (5 rows) =# create MATERIALIZED VIEW citext_matview AS select name from citext_table where id 0 group by name; SELECT 3 =# table citext_matview; name --- one three (3 rows) =# set enable_seqscan = false; SET =# select name from citext_table where id 0 group by name; name --- one Three (3 rows) Yes, the 'set enable_seqscan = false;' is a bit of a cheat- but I hope you agree that it *shouldn't* change the *result* of a query. There's certainly other cases where a different plan could be picked and result in the same kind of difference between the matview and running the query. Build a matview w/ max(id) as id and a unique index on 'id' and you'll still get the same issue. The problem is that the seqscan will pick up on 'three' first while the index scan will find 'Three' first. This is all with the patch from http://www.postgresql.org/message-id/1379024847.48294.yahoomail...@web162904.mail.bf1.yahoo.com applied. I simply don't believe it's possible to be consistent in this unless we declare the type's equality to be insufficient for us and, everywhere that we call a type's equality operator, *also* check which of the equal values involved in the comparison is bigger or lower (based on binary comparison) and actually pick one, consistently, all the time. This is why I'm complaining that we're trying to paper over a difference that just isn't that simple. I understand why you're trying to- it's definitely annoying, but this isn't so much a solution as it is a hack that doesn't actually work in all cases. I don't have a silver bullet for this but I don't like saying let's implement these binary operators and make things *look* consistent in most cases, and then fail subtly in complex situations. I agree that users may complain if the underlying relation ends up not having *any* entries with the value that's in the matview and a refresh doesn't update it. I expect users will *also* complain if we implement these operators and then they write their own queries using this awesome new 'binary' comparison operator to compare their equal-to-citext strings- and discover that the binary operators say things that *look* the same are actually *different* (thinking encoding issues, overlong encodings, etc). That'd be a subtle and painful bug (technically in their code, not ours, but still) to find, and then what do you do? Argueing against citext (as I've had to do in the past..) would probably be more difficult for some if they saw these operators and figured they could use them. As I mentioned in the thread w/ Robert, when looking at further optimizations down the road, we're going to be in a situation of looking at only a subset of the records and then considering what to do when supporting MAX() efficiently leads us to running 'greater(existing,new)' which returns false, but q and r are binary different. This argument is saying always replace what's there but every other, existing, value in the table might be represented by the 'existing' representation. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On Mon, Sep 23, 2013 at 9:21 PM, Stephen Frost sfr...@snowman.net wrote: Here's an example to illustrate what I'm talking about when it comes down to you can't claim that you'll produce exactly what the query will always, with these types: Your example demonstrates that if a given query can generate two different outputs, A and B, based on the same input data, then the contents of the materialized view cannot be equal to be A and also equal to B. Well, no duh. Of course, you don't need citext, or any other data type with a loose notion of equality, to generate that sort of problem: rhaas=# create table chicken_little (d date); CREATE TABLE rhaas=# insert into chicken_little values ('2012-02-21'); INSERT 0 1 rhaas=# create materialized view henny_penny as select d::text from chicken_little; SELECT 1 rhaas=# table chicken_little; d 2012-02-21 (1 row) rhaas=# table henny_penny; d 2012-02-21 (1 row) rhaas=# set datestyle = 'german'; SET rhaas=# table chicken_little; d 21.02.2012 (1 row) rhaas=# table henny_penny; d 2012-02-21 (1 row) But I'm still wondering what this is intended to prove. There are an infinite number of ways to write queries that produce different results, and I think we all know that materialized views aren't going to hold up very well if given such queries. That seems a poor excuse for not fixing the cases that can be made to work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On Thu, Sep 19, 2013 at 06:31:38PM -0400, Steve Singer wrote: I think there is agreement that better (as in more obscure) operators than === and !== need to be picked and we need to find a place in the user-docs to warn users of the behaviour of this operators. Hannu has proposed *==* binary equal, surely very equal by any other definition as wall !==? maybe not equal -- binary inequal, may still be equal by other comparison methods It's a pity operators must be non-alpha and can't be named. Something like: SELECT foo OPERATOR(byte_equivalent) bar; is simultaneously obscure, yet clear. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: ... like just refreshing a view so that the new contents are the same as what you would see if you re-ran the query defining the matview. I've heard this notion a few times of wanting to make sure that what you get from running the query matches the matview. While that makes sense when the equality operator and what-you-see actually match, it doesn't when they don't because the what-you-see ends up being non-deterministic and can change based on the order the datums are seen in during the query processing which can change with different data ordering on disk and even due to simply different plans for the same data, I believe. Consider a GROUP BY with a citext column as one of the key fields. You're going to get whatever value the aggregate happened to come across first when building the HashAgg. Having the binary equality operator doesn't help that and seems like it could even result in change storms happening due to a different plan when the actual data didn't change. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator - Review
Steve Singer st...@ssinger.info wrote: On 09/12/2013 06:27 PM, Kevin Grittner wrote: Attached is a patch for a bit of infrastructure I believe to be necessary for correct behavior of REFRESH MATERIALIZED VIEW CONCURRENTLY as well as incremental maintenance of matviews. Here is attempt at a review of the v1 patch. Thanks for looking at it. There has been a heated discussion on list about if we even want this type of operator. It's not a question of whether we want to allow operators which define comparisons between objects in a non-standard way. We already have 11 such cases; this would add one more to make 12. In all cases there are different operators for making a comparison that return potentially different results from the default operators, to support uses that are specific to that type. I think there is agreement that better (as in more obscure) operators than === and !== need to be picked and we need to find a place in the user-docs to warn users of the behaviour of this operators. Hannu has proposed *==* binary equal, surely very equal by any other definition as wall !==? maybe not equal -- binary inequal, may still be equal by other comparison methods and no one has yet objected to this. I do. The issue is that there are a great many places that there are multiple definitions of equality. We generally try to use something which tries to convey something about the nature of the operation, since the fact that it can have different results is a given. There is nothing in those operators that says binary image comparison. I thought about prepending a hash character in front of normal operators, because that somehow suggests binary operations to my eye (although I have no idea whether it does so for anyone else), but those operators are already used for an alternative set of comparisons for intervals, with a different meaning. I'm still trying to come up with something I really like. My vote would be to update the patch to use those operator names and then figure out where we can document them. It it means we have to section describing the equal operator for records as well then maybe we should do that so we can get on with things. Code Review The record_image_eq and record_image_cmp functions are VERY similar. It seems to me that you could have the meet of these functions into a common function (that isn't exposed via SQL) that then behaves differently in 2 or 3 spots based on a parameter that controls if it detoasts the tuple for the compare or returns false on the equals. Did you look at the record_eq and record_cmp functions which exist before this patch? Is there a reason we should do it one way for the default operators and another way for this alternative? Do you think we should change record_eq and record_cmp to do things the way you suggest? Beyond that I don't see any issues with the code. You call out a question about two records being compared have a different number of columns should it always be an error, or only an error when they match on the columns they do have in common. The current behaviour is select * FROM r1,r2 where r1==r2; a | b | a | b | c ---+---+---+---+--- 1 | 1 | 1 | 2 | 1 (1 row) update r2 set b=1; UPDATE 1 test=# select * FROM r1,r2 where r2==r1; ERROR: cannot compare record types with different numbers of columns This seems like a bad idea to me. I don't like that I get a type comparison error only sometimes based on the values of the data in the column. If I'm a user testing code that compares two records with this operator and I test my query with 1 value pair then and it works then I'd naively expect to get a true or false on all other value sets of the same record type. Again, this is a result of following the precedent set by the existing record comparison operators. test=# create table r1 (c1 int, c2 int); CREATE TABLE test=# create table r2 (c1 int, c2 int, c3 int); CREATE TABLE test=# insert into r1 values (1, 2); INSERT 0 1 test=# insert into r2 values (3, 4, 5); INSERT 0 1 test=# select * from r1 join r2 on r1 r2; c1 | c2 | c1 | c2 | c3 ++++ 1 | 2 | 3 | 4 | 5 (1 row) test=# update r1 set c1 = 3, c2 = 4; UPDATE 1 test=# select * from r1 join r2 on r1 r2; ERROR: cannot compare record types with different numbers of columns Would be in favor of forcing a change to the record comparison operators which have been in use for years? If not, is there a good reason to have them behave differently in this regard? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
Steve, Thanks for providing a summary. * Steve Singer (st...@ssinger.info) wrote: The arguments for this patch are * We want the materialized view to return the same value as would be returned if the query were executed directly. This not only means that the values should be the same according to a datatypes = operator but that they should appear the same 'to the eyeball'. With the cases where the equality operator doesn't match the 'to the eyeball' appearance, this really seems to be a pipe dream to me, unless we *also* define an ordering or similar which would then change the existing semantics for end users (which might be reasonable), but I'm not sure how we could do that without input from the type- iow, I don't think we can say well, we'll order by byte representation. It's also going to possibly be a performance hit since we're going to have to do both an equality op call during a HashAgg/HashJoin/whatever and have to do a which one is bigger of these two equal things, and then I wonder if we'd need to allow users to somehow specify I don't want the bigger of the equal things, I want the smaller, in this query for this equality check. I'd really like to see how we're going to provide for that when the user is doing a GROUP BY without breaking something else or causing problems with later SQL spec revisions. * Supporting the materialized view refresh check via SQL makes a lot of sense but doing that requires exposing something via SQL ... which we don't currently expose for the queries that we already support users running today. Users seem to generally be accepting of that too, even though what they end up with in the output isn't necessairly consistent from query to query. The issue here is that we're trying to make the mat view look like what the query would do when run at the same time, which is a bit of a losing battle, imv. * If we adequately document what we mean by record_image_identical and the operator we pick for this then users shouldn't be surprised at what they get if they use this That's a bit over-simplistic, really. We do try to keep to the POLA (principle of least astonishment) and that's not going to be easy here. I think there is agreement that better (as in more obscure) operators than === and !== need to be picked and we need to find a place in the user-docs to warn users of the behaviour of this operators. Hannu has proposed I'm a bit on the fence about this, after having discussed my concerns with Robert at PostgresOpen. If we're going to expose and support these at the SQL level, and we can figure out some semantics and consistency for using them that isn't totally baffling, then perhaps having them as simple and clear operator names would be reasonable. One concern here is if the SQL spec might decide some day that '===' is a useful operator for something else (perhaps even they look the same when cast to text). *==* binary equal, surely very equal by any other definition as wall !==? maybe not equal -- binary inequal, may still be equal by other comparison methods Those look alright to me also though, but we'd need to work out the other operation names, right? And then figure out if we want to use those other operators (less-than, greater-than, etc) when doing equality operations to figure out which equal value to return to the user. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: * Kevin Grittner (kgri...@ymail.com) wrote: ... like just refreshing a view so that the new contents are the same as what you would see if you re-ran the query defining the matview. I've heard this notion a few times of wanting to make sure that what you get from running the query matches the matview. While that makes sense when the equality operator and what-you-see actually match, it doesn't when they don't because the what-you-see ends up being non-deterministic and can change based on the order the datums are seen in during the query processing which can change with different data ordering on disk and even due to simply different plans for the same data, I believe. That's a fair point to some extent. Where notions of equality differ, it is not always non-deterministic, but it can be. For citext you are correct. For a sum() of numeric data, the number of decimal positions will be the largest value seen; the value present in the query results will not vary by order of rows scanned or by plan. The result of this is that with the patch, an incremental refresh of a matview will always match what the query returned at the time it was run (there is no *correctness* problem) but if someone uses a query with non-deterministic results they may have a lot of activity on a concurrent refresh even if there was no change to the underlying data -- so you could have a *performance* penalty in cases where the query returns something different, compared to leaving the old equal but not the same results. Consider a GROUP BY with a citext column as one of the key fields. You're going to get whatever value the aggregate happened to come across first when building the HashAgg. Having the binary equality operator doesn't help that and seems like it could even result in change storms happening due to a different plan when the actual data didn't change. Yup. A person who wants to GROUP BY a citext value for a matview might want to fold it to a consistent capitalization to avoid that issue. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
Stephen Frost sfr...@snowman.net wrote: The issue here is that we're trying to make the mat view look like what the query would do when run at the same time, which is a bit of a losing battle, imv. If we're not doing that, I don't see the point of matviews. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On Friday, September 20, 2013, Kevin Grittner wrote: Stephen Frost sfr...@snowman.net javascript:; wrote: The issue here is that we're trying to make the mat view look like what the query would do when run at the same time, which is a bit of a losing battle, imv. If we're not doing that, I don't see the point of matviews. When taken out of context, I can see how that might not come across correctly. I was merely pointing out that it's a losing battle in the context of types which have equality operators which claim equalness but cast to text with results that don't match there. Thanks, Stephen
Re: [HACKERS] record identical operator - Review
Stephen Frost sfr...@snowman.net wrote: On Friday, September 20, 2013, Kevin Grittner wrote: Stephen Frost sfr...@snowman.net wrote: The issue here is that we're trying to make the mat view look like what the query would do when run at the same time, which is a bit of a losing battle, imv. If we're not doing that, I don't see the point of matviews. When taken out of context, I can see how that might not come across correctly. I was merely pointing out that it's a losing battle in the context of types which have equality operators which claim equalness but cast to text with results that don't match there. I think the patch I've submitted wins that battle. The only oddity is that if someone uses a query for a matview which can provide results with rows which are equal to previous result rows under the default record opclass but different under this patch's opclass, RMVC could update to the latest query results when someone thinks that might not be necessary. Workarounds would include using a query with deterministic results (like using the upper() or lower() function on a grouped citext column in the result set) or not using the CONCURRENTLY option. Neither seems too onerous. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: The result of this is that with the patch, an incremental refresh of a matview will always match what the query returned at the time it was run (there is no *correctness* problem) but if someone uses a query with non-deterministic results they may have a lot of activity on a concurrent refresh even if there was no change to the underlying data -- so you could have a *performance* penalty in cases where the query returns something different, compared to leaving the old equal but not the same results. You mean 'at the time of the incremental refresh', right? Yet that may or may not match running that query directly by an end-user as the plan that a user might get for the entire query could be different than what is run for an incremental update, or due to statistics updates, etc. Consider a GROUP BY with a citext column as one of the key fields. You're going to get whatever value the aggregate happened to come across first when building the HashAgg. Having the binary equality operator doesn't help that and seems like it could even result in change storms happening due to a different plan when the actual data didn't change. Yup. A person who wants to GROUP BY a citext value for a matview might want to fold it to a consistent capitalization to avoid that issue. I'm trying to figure out why that's a perfectly acceptable solution for users running views with GROUP BYs, but apparently it isn't sufficient for mat views? In other words, you would suggest telling users sorry, you can't rely on the value returned by a GROUP BY on that citext column using a normal view, but we're going to try and do better for mat views. I don't intend the above to imply that we should never update values in mat views when we can do so in a reliable way to ensure that the value matches what a view would return. This matches our notion of UPDATE, where we will still UPDATE a value even if the old value and the new value are equal according to the type's equality operator, when the conditional for the UPDATE is using a reliable type (eg: integer). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On 2013-09-20 10:51:46 -0400, Stephen Frost wrote: I'm trying to figure out why that's a perfectly acceptable solution for users running views with GROUP BYs, but apparently it isn't sufficient for mat views? Err, because users wrote a GROUP BY? They haven't (neccessarily) in the cases of the matviews we're talking about? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: * Kevin Grittner (kgri...@ymail.com) wrote: The result of this is that with the patch, an incremental refresh of a matview will always match what the query returned at the time it was run (there is no *correctness* problem) but if someone uses a query with non-deterministic results they may have a lot of activity on a concurrent refresh even if there was no change to the underlying data -- so you could have a *performance* penalty in cases where the query returns something different, compared to leaving the old equal but not the same results. You mean 'at the time of the incremental refresh', right? Yet that may or may not match running that query directly by an end-user as the plan that a user might get for the entire query could be different than what is run for an incremental update, or due to statistics updates, etc. I'm confused. The refresh *does* run the query. Sure, if the query is run again it could return different results. I'm missing the point here. Consider a GROUP BY with a citext column as one of the key fields. You're going to get whatever value the aggregate happened to come across first when building the HashAgg. Having the binary equality operator doesn't help that and seems like it could even result in change storms happening due to a different plan when the actual data didn't change. Yup. A person who wants to GROUP BY a citext value for a matview might want to fold it to a consistent capitalization to avoid that issue. I'm trying to figure out why that's a perfectly acceptable solution for users running views with GROUP BYs, but apparently it isn't sufficient for mat views? In other words, you would suggest telling users sorry, you can't rely on the value returned by a GROUP BY on that citext column using a normal view, but we're going to try and do better for mat views. Again, I'm lost. If they don't do something to make the result deterministic, it could be different on each run of the VIEW, and on each REFRESH of the matview. I don't see why that is an argument for trying to suppress the effort needed make the matview match the latest run of the query. I don't intend the above to imply that we should never update values in mat views when we can do so in a reliable way to ensure that the value matches what a view would return. This matches our notion of UPDATE, where we will still UPDATE a value even if the old value and the new value are equal according to the type's equality operator, when the conditional for the UPDATE is using a reliable type (eg: integer). Well, we provide a trigger function to suppress the UPDATE operation if the old and new values are identical -- in terms of what is stored. We do not attempt to use the default btree equals operator to suppress updates to different values in some circumstances. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Andres Freund (and...@2ndquadrant.com) wrote: On 2013-09-20 10:51:46 -0400, Stephen Frost wrote: I'm trying to figure out why that's a perfectly acceptable solution for users running views with GROUP BYs, but apparently it isn't sufficient for mat views? Err, because users wrote a GROUP BY? They haven't (neccessarily) in the cases of the matviews we're talking about? Sure; my thinking was going back to what Hannu had suggested where we have a mechanism to see if the value was updated (using xmin or similar) and then update it in the mat view in that case, without actually doing a comparison at all. That wouldn't necessairly work for the GROUP BY case, but that situation doesn't work reliably today anyway. If we solve the GROUP BY case in SQL for these types then we could use the same for mat views, but we've been happy enough to ignore the issue thus far. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On 2013-09-20 11:05:06 -0400, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: On 2013-09-20 10:51:46 -0400, Stephen Frost wrote: I'm trying to figure out why that's a perfectly acceptable solution for users running views with GROUP BYs, but apparently it isn't sufficient for mat views? Err, because users wrote a GROUP BY? They haven't (neccessarily) in the cases of the matviews we're talking about? Sure; my thinking was going back to what Hannu had suggested where we have a mechanism to see if the value was updated (using xmin or similar) and then update it in the mat view in that case, without actually doing a comparison at all. VACUUM, HOT pruning. Have fun. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: Stephen Frost sfr...@snowman.net wrote: You mean 'at the time of the incremental refresh', right? Yet that may or may not match running that query directly by an end-user as the plan that a user might get for the entire query could be different than what is run for an incremental update, or due to statistics updates, etc. I'm confused. The refresh *does* run the query. Sure, if the query is run again it could return different results. I'm missing the point here. Perhaps I'm assuming things are farther along than they are.. I was assumed that 'incremental mat view' updates were actuallly 'partial'- iow, that it keeps track of the rows in the mat view which are impacted by the set of rows in the base relations and then runs specific queries to pull out and operate on the base-rel set-of-rows to update the specific mat-view-rows. If it's running the whole query again then it's certainly more likely to get the same results that the user did, but it's not a guarantee and that's only a happy coincidence while we're still doing the whole query approach (which I hope we'd eventually progress beyond..). I'm trying to figure out why that's a perfectly acceptable solution for users running views with GROUP BYs, but apparently it isn't sufficient for mat views? In other words, you would suggest telling users sorry, you can't rely on the value returned by a GROUP BY on that citext column using a normal view, but we're going to try and do better for mat views. Again, I'm lost. Perhaps my reply to Andres will help clear that up? If they don't do something to make the result deterministic, it could be different on each run of the VIEW, and on each REFRESH of the matview. I don't see why that is an argument for trying to suppress the effort needed make the matview match the latest run of the query. Is this really just run the new query and look if the old and new row are different wrt 'incremental' view updates? If so, I think we're trying to design something here that we're going to totally break very shortly when we move beyond that kind of sledgehammer approach to incremental mat view updates and I'd rather we figure out a solution which will work beyond that.. I don't intend the above to imply that we should never update values in mat views when we can do so in a reliable way to ensure that the value matches what a view would return. This matches our notion of UPDATE, where we will still UPDATE a value even if the old value and the new value are equal according to the type's equality operator, when the conditional for the UPDATE is using a reliable type (eg: integer). Well, we provide a trigger function to suppress the UPDATE operation if the old and new values are identical -- in terms of what is stored. We do not attempt to use the default btree equals operator to suppress updates to different values in some circumstances. This feels like we're trying to figure out how to create a key for a whole row based on the binary representation of that row and then use that to check if we should update a given row or not. This seems a bit radical, but perhaps that's what we should just do then, rather than invent binary equivilance operators for what-things-look-like-now for this- extract the row columns out using their binary _send functions and then hash the result to build a key that you can use. At least then we're using a documented (well, we could probably improve on this, but still) and stable representation of the data that at least some of our users already deal with. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
* Andres Freund (and...@2ndquadrant.com) wrote: On 2013-09-20 11:05:06 -0400, Stephen Frost wrote: Sure; my thinking was going back to what Hannu had suggested where we have a mechanism to see if the value was updated (using xmin or similar) and then update it in the mat view in that case, without actually doing a comparison at all. VACUUM, HOT pruning. Have fun. Yea, clearly oversimplified, but I do expect that we're going to reach a point where we're looking at the rows being updated in the base rels and which rows they map to in the view and then marking those rows as needing to be updated. That whole mechanism doesn't depend on this are-they-binary-equal approach and is what I had anticipated as the path we'd be going down in the future. The above is also what I recall had been discussed at the hackers meeting, along with some ideas/papers about how to specifically implement partial updates, hence my assumption that was what we were talking about.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator - Review
On 09/20/2013 08:38 AM, Kevin Grittner wrote: Did you look at the record_eq and record_cmp functions which exist before this patch? Is there a reason we should do it one way for the default operators and another way for this alternative? Do you think we should change record_eq and record_cmp to do things the way you suggest? I think the record_eq and record_cmp functions would be better if they shared code as well, but I don't think changing that should be part of this patch, you aren't otherwise touching those functions. I know some people dislike code that is switch based and prefer duplication, my preference is to avoid duplication. This seems like a bad idea to me. I don't like that I get a type comparison error only sometimes based on the values of the data in the column. If I'm a user testing code that compares two records with this operator and I test my query with 1 value pair then and it works then I'd naively expect to get a true or false on all other value sets of the same record type. Again, this is a result of following the precedent set by the existing record comparison operators. test=# create table r1 (c1 int, c2 int); CREATE TABLE test=# create table r2 (c1 int, c2 int, c3 int); CREATE TABLE test=# insert into r1 values (1, 2); INSERT 0 1 test=# insert into r2 values (3, 4, 5); INSERT 0 1 test=# select * from r1 join r2 on r1 r2; c1 | c2 | c1 | c2 | c3 ++++ 1 | 2 | 3 | 4 | 5 (1 row) test=# update r1 set c1 = 3, c2 = 4; UPDATE 1 test=# select * from r1 join r2 on r1 r2; ERROR: cannot compare record types with different numbers of columns Would be in favor of forcing a change to the record comparison operators which have been in use for years? If not, is there a good reason to have them behave differently in this regard? -- I hadn't picked up on that you copied that part of the behaviour from the existing comparison operators. No we would need a really good reason for changing the behaviour of the comparison operators and I don't think we have that. I agree that the binary identical operators should behave similarly to the existing comparison operators and error out on the column number mismatch after comparing the columns that are present in both. Steve Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Dimitri Fontaine escribió: My understanding is that if you choose citext then you don't care at all about the case, so that the two relations actually yield the same results for the right definition of same here: the citext one. For the record, I don't think citext means that the user doesn't care about the case; it only means they want the comparisons to be case-insensitive, but case should nonetheless be preserved. That is, case-insensitive, case-preserving. A parallel is MS-DOS file name semantics. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Kevin Grittner kgri...@ymail.com writes: There are examples in the patch and this thread, but rather than reference back to those I'll add a new one. Without the patch: Thanks much for doing that. The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case, so that the two relations actually yield the same results for the right definition of same here: the citext one. In other words, the results only look different in ways that don't matter for the datatype involved, and I think that if it matters to the user then he needs to review is datatype choices or view definition. So my position on that is that your patch is only adding confusion for no benefits that I'm able to understand. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 09/19/2013 10:54 AM, Dimitri Fontaine wrote: Kevin Grittner kgri...@ymail.com writes: There are examples in the patch and this thread, but rather than reference back to those I'll add a new one. Without the patch: Thanks much for doing that. The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case, so that the two relations actually yield the same results for the right definition of same here: the citext one. In other words, the results only look different in ways that don't matter for the datatype involved, and I think that if it matters to the user then he needs to review is datatype choices or view definition. So my position on that is that your patch is only adding confusion for no benefits that I'm able to understand. The aim is to have a view and materialized view return the same results. If they do not, the user is guaranteed to be confused and to consider the matview implementation broken the patch solves the general problem of when the table changes, refresh After saying it like this, the problem could also be solved by including xmin(s) for rows from underlying table(s)in the matview. Would this be a better approach ? -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Kevin Grittner kgri...@ymail.com writes: The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case That's not my understanding. If that was what citext was for it would be much simpler to force the case in creating each value. It *preserves* the case for display, but ignores it for comparisons. That's the contract of the type, like it or not. Equal does not mean the same. They clearly want to preserve and display differences among equal values. Streaming replication would preserve the differences. pg_dump and restore of the data would preserve the differences. SELECTing the data shows the differences. suppress_redundant_updates_trigger() will not suppress an UPDATE setting an equal value unless it is also *the same*. A normal VIEW shows the differences. RMV without CONCURRENTLY will show the difference. I'm suggesting that RMVC and the upcoming incremental update of matviews should join this crowd. A matview which is being refreshed with the CONCURRENTLY option, or maintained by incremental maintenance should not look different from a regular VIEW, and should not suddenly look completely different after a non-concurrent REFRESH. If things are left that way, people will quite justifiably feel that matviews are broken. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 2013-09-19 05:33:22 -0700, Kevin Grittner wrote: Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Kevin Grittner kgri...@ymail.com writes: The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case That's not my understanding. If that was what citext was for it would be much simpler to force the case in creating each value. It *preserves* the case for display, but ignores it for comparisons. That's the contract of the type, like it or not. Equal does not mean the same. They clearly want to preserve and display differences among equal values. I agree. I am not 100% sure if the can of worms this opens is worth the trouble, but from my POV it's definitely an understandable and sensible goal. My complaints about this subfeature were never about trying to get that right for matviews... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Hannu Krosing ha...@2ndquadrant.com wrote: the patch solves the general problem of when the table changes, refresh After saying it like this, the problem could also be solved by including xmin(s) for rows from underlying table(s)in the matview. Would this be a better approach ? Now you're moving from REFRESH territory into incremental maintenance. There have been a very large number of papers written on that topic, I have reviewed the literature, and I have picked a technique which looks like it will be fast and reliable -- based on relational algebra. Let's save discussion of alternatives such as you're suggesting here, for when I get past the easy stuff ... like just refreshing a view so that the new contents are the same as what you would see if you re-ran the query defining the matview. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 09/19/2013 02:41 PM, Kevin Grittner wrote: Hannu Krosing ha...@2ndquadrant.com wrote: the patch solves the general problem of when the table changes, refresh After saying it like this, the problem could also be solved by including xmin(s) for rows from underlying table(s)in the matview. Would this be a better approach ? Now you're moving from REFRESH territory into incremental maintenance. There have been a very large number of papers written on that topic, I have reviewed the literature, and I have picked a technique which looks like it will be fast and reliable -- based on relational algebra. Let's save discussion of alternatives such as you're suggesting here, for when I get past the easy stuff ... like just refreshing a view so that the new contents are the same as what you would see if you re-ran the query defining the matview. I'm sure that comparing xmin records would work exactly similar to binary comparisons of records for detecting possible change in 99.999% of real-world cases. I am also pretty sure it would have its own cans of worms all over again when trying to actually implement this in matviews :) Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On Wed, Sep 18, 2013 at 7:29 PM, Vik Fearing vik.fear...@dalibo.com wrote: On 09/19/2013 12:55 AM, Dimitri Fontaine wrote: We have, as a community, gone to a fair amount of trouble to make the concept of equality pluggable and allow multiple types of equality per type. To me it seems the perfect tool to solve this problem. Why the fuss? Because I don't understand why you need another equality than the default btree one, certainly. Basically because 'word'::citext and 'Word'::citext are the same to your brain but not to your eyeballs. Unique indexes, for example, only need to please your brain. Matviews need to please your eyeballs. Right, and well said. It's perfectly reasonable to want a unique index that doesn't allow both Kevin O'leary and Kevin O'Leary to be listed in the irish_names table, but it's also perfectly reasonable to want to remember that the user who entered the data spelled it the second way and not the first. And it's also reasonable to want any corrections made to the table to propagate to any materialized views defined over it. The fact that we have multiple concepts of equality can be confusing, but it's not for no reason, and we're not the only database or programming language to have such a concept. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator - Review
On 09/12/2013 06:27 PM, Kevin Grittner wrote: Attached is a patch for a bit of infrastructure I believe to be necessary for correct behavior of REFRESH MATERIALIZED VIEW CONCURRENTLY as well as incremental maintenance of matviews. Here is attempt at a review of the v1 patch. There has been a heated discussion on list about if we even want this type of operator. I will try to summarize it here The arguments against it are * that a record_image_identical call on two records that returns false can still return true under the equality operator, and the records might (or might not) appear to be the same to users * Once we expose this as an operator via SQL someone will find it and use it and then complain when we change things such as the internal representation of a datatype which might break there queries * The differences between = and record_image_identical might not be understood by everywhere who finds the operator leading to confusion * Using a criteria other than equality (the = operator provided by the datatype) might cause problems if we later provide 'on change' triggers for materialized views The arguments for this patch are * We want the materialized view to return the same value as would be returned if the query were executed directly. This not only means that the values should be the same according to a datatypes = operator but that they should appear the same 'to the eyeball'. * Supporting the materialized view refresh check via SQL makes a lot of sense but doing that requires exposing something via SQL * If we adequately document what we mean by record_image_identical and the operator we pick for this then users shouldn't be surprised at what they get if they use this I think there is agreement that better (as in more obscure) operators than === and !== need to be picked and we need to find a place in the user-docs to warn users of the behaviour of this operators. Hannu has proposed *==* binary equal, surely very equal by any other definition as wall !==? maybe not equal -- binary inequal, may still be equal by other comparison methods and no one has yet objected to this. My vote would be to update the patch to use those operator names and then figure out where we can document them. It it means we have to section describing the equal operator for records as well then maybe we should do that so we can get on with things. Code Review The record_image_eq and record_image_cmp functions are VERY similar. It seems to me that you could have the meet of these functions into a common function (that isn't exposed via SQL) that then behaves differently in 2 or 3 spots based on a parameter that controls if it detoasts the tuple for the compare or returns false on the equals. Beyond that I don't see any issues with the code. You call out a question about two records being compared have a different number of columns should it always be an error, or only an error when they match on the columns they do have in common. The current behaviour is select * FROM r1,r2 where r1==r2; a | b | a | b | c ---+---+---+---+--- 1 | 1 | 1 | 2 | 1 (1 row) update r2 set b=1; UPDATE 1 test=# select * FROM r1,r2 where r2==r1; ERROR: cannot compare record types with different numbers of columns This seems like a bad idea to me. I don't like that I get a type comparison error only sometimes based on the values of the data in the column. If I'm a user testing code that compares two records with this operator and I test my query with 1 value pair then and it works then I'd naively expect to get a true or false on all other value sets of the same record type. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On Tue, Sep 17, 2013 at 8:23 AM, Kevin Grittner kgri...@ymail.com wrote: To have clean semantics, I think the operator should mean that the stored format of the row is the same. Regarding the array null bitmap example, I think it would be truly weird if the operator said that the stored format was the same, but this happened: test=# select pg_column_size(ARRAY[1,2,3]); pg_column_size 36 (1 row) test=# select pg_column_size((ARRAY[1,2,3,NULL])::int4[3]); pg_column_size 44 (1 row) They have the same stored format, but a different number of bytes?!? Hmm. For most of this thread, I was leaning toward the view that comparing the binary representations was the wrong concept, and that we actually needed to have type-specific operators that understand the semantics of the data type. But I think this example convinces me otherwise. What we really want to do here is test whether two values are the same, and if you can feed two values that are supposedly the same to some function and get two different answers, well then they're not really the same, are they? Now, I grant that the array case is pretty weird. An array with an all-zeroes null bitmap is basically semantically identical to one with no null bitmap at all. But there are other such cases as well. You can have two floats that print the same way except when extra_float_digits=3, for example, and I think that's probably a difference that we *wouldn't* want to paper over. You can have a long-form numeric that represents a value that could have been represented as a short-form numeric, which is similar to the array case. There are probably other examples as well. But in each of those cases, the point is that there *is* some operation which will distinguish between the two supposedly-identical values, and therefore they are not identical for all purposes. Therefore, I see no harm in having an operator that tests for are-these-values-identical-for-all-purposes. If that's useful for RMVC, then there may be other legitimate uses for it as well. And once we decide that's OK, I think we ought to document it. Sure, it's a little confusing, but we can explain it, I think. It's a good opportunity to point out to people that, most of the time, they really want something else, like the equality operator for the default btree opclass. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Robert Haas (robertmh...@gmail.com) wrote: Therefore, I see no harm in having an operator that tests for are-these-values-identical-for-all-purposes. If that's useful for RMVC, then there may be other legitimate uses for it as well. And once we decide that's OK, I think we ought to document it. Sure, it's a little confusing, but we can explain it, I think. It's a good opportunity to point out to people that, most of the time, they really want something else, like the equality operator for the default btree opclass. For my 2c on this, while this can be useful for *us*, and maybe folks hacking pretty close to PG, I can't get behind introducing this as an '===' or some such operator. I've missed why this can't be a simple function and why in the world we would want to encourage users to use this by making it look like a normal language construct of SQL, which damn well better consider numbers which are equal in value to be equal, regardless of their representation. What the heck is the use case for this being a user-oriented, SQL operator..? Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On 2013-09-18 11:06:13 -0500, Merlin Moncure wrote: Ugh. This feels like a pretty ugly hack to deal with that. I haven't got any magical wand to address it, but making an SQL operator for 'are these really the same bytes' to deal with what is essentially implementation detail is _very_ grotty. I know the feeling, but I don't have a better suggestion either, so... Having matviews using SQL expressible features is a *good* thing. Having a user accessible operator is nice to have (if for no other reason than to allow testing for which matview rows would be refreshed). I just don't understand what all the fuss is about except to make sure not to utilize an operator name that is better suited for other purposes. It's an externally exposed API with not easily understandable semantics that's not actually all that useful outside specific usecases. If we decide to change it we're creating an API breakage. And we get to deal with people saying it's broken because they don't understand the semantics. That said, I am ok with this if we use strange operator names and document that the semantics are complicated... ==!! ==!!= ... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 2013-09-18 11:50:23 -0400, Stephen Frost wrote: For my 2c on this, while this can be useful for *us*, and maybe folks hacking pretty close to PG, I can't get behind introducing this as an '===' or some such operator. I've missed why this can't be a simple function and why in the world we would want to encourage users to use this by making it look like a normal language construct of SQL, which damn well better consider numbers which are equal in value to be equal, regardless of their representation. I certainly understand the feeling... I think this really needs to have an obscure name. Like ==!!== or somesuch (is equal very much, but doesn't actually test for equality ;)) What the heck is the use case for this being a user-oriented, SQL operator..? The materalized view code uses generated SQL, so it has to be SQL accessible. And it needs to be an operator because the join planning code requires that :( Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: making an SQL operator for 'are these really the same bytes' to deal with what is essentially implementation detail is _very_ grotty. We already have some such operators, although Andres argues that comparing to that isn't fair because we at least know it is a string of characters; we're just ignoring character boundaries and collations. Some of the operators use for the existing byte comparison opclasses are: ~~ ~=~ ~=~ ~~ Go ahead and try them out with existing text values. Andres has said that he has seen these used in production systems. = and aren't listed above even though they do a byte-for-byte comparison because, well, I guess because we have chosen to treat two UTF8 strings which produce the same set of glyphs using different bytes as unequal. :-/ -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Robert Haas robertmh...@gmail.com wrote: Kevin Grittner kgri...@ymail.com wrote: To have clean semantics, I think the operator should mean that the stored format of the row is the same. Regarding the array null bitmap example, I think it would be truly weird if the operator said that the stored format was the same, but this happened: test=# select pg_column_size(ARRAY[1,2,3]); pg_column_size 36 (1 row) test=# select pg_column_size((ARRAY[1,2,3,NULL])::int4[3]); pg_column_size 44 (1 row) They have the same stored format, but a different number of bytes?!? Hmm. For most of this thread, I was leaning toward the view that comparing the binary representations was the wrong concept, and that we actually needed to have type-specific operators that understand the semantics of the data type. But I think this example convinces me otherwise. What we really want to do here is test whether two values are the same, and if you can feed two values that are supposedly the same to some function and get two different answers, well then they're not really the same, are they? Right. Not only would the per-type solution make materialized views maintenance broken by default, requiring per-type work to make it work reasonably, with silent failures for any type you didn't know about, but no user-visible differences is a pretty slippery concept. Did you really think of all the functions someone might use to look at a value? Might there be performance differences we care about that should be handled, even if the user has no way to dig out the difference? Will that change in a future release? Now, I grant that the array case is pretty weird. An array with an all-zeroes null bitmap is basically semantically identical to one with no null bitmap at all. But there are other such cases as well. You can have two floats that print the same way except when extra_float_digits=3, for example, and I think that's probably a difference that we *wouldn't* want to paper over. You can have a long-form numeric that represents a value that could have been represented as a short-form numeric, which is similar to the array case. There are probably other examples as well. But in each of those cases, the point is that there *is* some operation which will distinguish between the two supposedly-identical values, and therefore they are not identical for all purposes. Therefore, I see no harm in having an operator that tests for are-these-values-identical-for-all-purposes. If that's useful for RMVC, then there may be other legitimate uses for it as well. And once we decide that's OK, I think we ought to document it. That seems to be the consensus. I don't think we can really document this form of record comparison without also documenting how equality works. I'll work something up for the next version of the patch. Sure, it's a little confusing, but we can explain it, I think. It's a good opportunity to point out to people that, most of the time, they really want something else, like the equality operator for the default btree opclass. I think the hardest part will be documenting the difference between the row value constructor semantics (which are all that is currently documented) and the record equality semantics (used for sorting and building indexes). In a green field I think I would have argued for having just the standard semantics we have documented, and modifying our sort execution nodes and index builds to deal with that. This is one of those cases where the breakage from changing to that is hard to justify on a cleaner conceptual semantics basis. There also seems to be universal agreement that the operator names should be something other than what I put in the v1 patch, but we don't have agreement on what should be used instead. We need six operators, to support the btree am requirements. Currently the patch has: === !== == == Suggested same as operators so far are: = = == Anyone want to champion one of those, or something else? How about the other five operators to go with your favorite? Keep in mind that this thread has also turned up strong support for an operator to express IS NOT DISTINCT FROM -- so that it can be used with ANY/ALL, among other things. Long term, having an opfamily for that might help us clean up the semantics of record comparison when there are NULLs involved. Currently we use the = operator but act as though IS NOT DISTINCT FROM was specified (except for some cases involving a row value constructor). Any serious discussion of that should probably move to a new thread, but I mention it here because some people wanted to reserve operator space for that, which could conflict with same as operators. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: Right. Not only would the per-type solution make materialized views maintenance broken by default, requiring per-type work to make it work reasonably, with silent failures for any type you didn't know about, but no user-visible differences is a pretty slippery concept. I don't like those possibilities, of course, but I'm starting to wonder about this whole concept of looking at it byte-wise. If I'm following correctly, what we're looking at here is having a way for matviews to tell if these bytes are the same as those bytes, for the purpose of deciding to update the matview, right? Yet we can then have cases where the row isn't *actually* different from a value perspective, yet we're going to update it anyway because it's represented slightly differently? What happens if we later want to add support for users to have a matview trigger that's called when a matview row *actually* changes? We'd end up calling it on what are essentially false positives, or having to do some double-check later on well, did it *really* change?, neither of which is good at all. If we had the IS NOT DISTINCT FROM operators discussed, would that work for this even if it isn't as performant? Or is there an issue with that? Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On Wed, Sep 18, 2013 at 10:59 AM, Stephen Frost sfr...@snowman.net wrote: * Andres Freund (and...@2ndquadrant.com) wrote: I think this really needs to have an obscure name. Like ==!!== or somesuch (is equal very much, but doesn't actually test for equality ;)) hah. What the heck is the use case for this being a user-oriented, SQL operator..? The materalized view code uses generated SQL, so it has to be SQL accessible. And it needs to be an operator because the join planning code requires that :( Ugh. This feels like a pretty ugly hack to deal with that. I haven't got any magical wand to address it, but making an SQL operator for 'are these really the same bytes' to deal with what is essentially implementation detail is _very_ grotty. Having matviews using SQL expressible features is a *good* thing. Having a user accessible operator is nice to have (if for no other reason than to allow testing for which matview rows would be refreshed). I just don't understand what all the fuss is about except to make sure not to utilize an operator name that is better suited for other purposes. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Andres Freund (and...@2ndquadrant.com) wrote: I think this really needs to have an obscure name. Like ==!!== or somesuch (is equal very much, but doesn't actually test for equality ;)) hah. What the heck is the use case for this being a user-oriented, SQL operator..? The materalized view code uses generated SQL, so it has to be SQL accessible. And it needs to be an operator because the join planning code requires that :( Ugh. This feels like a pretty ugly hack to deal with that. I haven't got any magical wand to address it, but making an SQL operator for 'are these really the same bytes' to deal with what is essentially implementation detail is _very_ grotty. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
* Merlin Moncure (mmonc...@gmail.com) wrote: Having matviews using SQL expressible features is a *good* thing. Fine, then it should be implemented *using SQL*, which is based on *values*, not on how the value is represented in bits and bytes. Having a user accessible operator is nice to have (if for no other reason than to allow testing for which matview rows would be refreshed). If it's not actually *changing* (wrt its value), then I'm not at all impressed with the notion that it's going to get updated anyway. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
* Kevin Grittner (kgri...@ymail.com) wrote: = and aren't listed above even though they do a byte-for-byte comparison because, well, I guess because we have chosen to treat two UTF8 strings which produce the same set of glyphs using different bytes as unequal. :-/ I tend to side with Andres on this case actually- we're being asked to store specific UTF8 bytes by the end user. That is not the same as treating two different numerics which are the same *number* as different because they have different binary representations, which is entirely an internal-to-postgres consideration. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
On 09/18/2013 11:39 AM, Stephen Frost wrote: * Kevin Grittner (kgri...@ymail.com) wrote: = and aren't listed above even though they do a byte-for-byte comparison because, well, I guess because we have chosen to treat two UTF8 strings which produce the same set of glyphs using different bytes as unequal. :-/ I tend to side with Andres on this case actually- we're being asked to store specific UTF8 bytes by the end user. That is not the same as treating two different numerics which are the same *number* as different because they have different binary representations, which is entirely an internal-to-postgres consideration. The problem is that there are datatypes (citext, postgis,...) that have defined = to return true when comparing two values that are different not just stored differently. Are you saying that matview's should update only when the = operator of the datatype returns false and if people don't like this behaviour they should fix the datatypes? Thanks, Stephen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
* Steve Singer (st...@ssinger.info) wrote: The problem is that there are datatypes (citext, postgis,...) that have defined = to return true when comparing two values that are different not just stored differently. If the definition of the type is that they're equal, then they're equal. Certainly there are cases where this is really rather broken (particularly in the postgis case that you mention), but I don't think that means we should change our definition of equality to generally be are the bytes the same- clearly that'd lead to incorrect behavior in the NUMERIC case. Are you saying that matview's should update only when the = operator of the datatype returns false and if people don't like this behaviour they should fix the datatypes? imv, we are depending on the = operator to tell us when the values are equal, regardless of type. I have a hard time seeing how we can do anything else. The PostGIS situation is already broken when you consider UNIQUE constraints and, while it's unfortunate that they'd need to change their data type to fix that, I do feel it's on them to deal with it. Anyone can create an extension with their own data type which returns wrong data and results, it's not on us to figure out how to make those work even in the face of blatent violations like making = not actually mean these values are the same. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: If it's not actually *changing* (wrt its value), then I'm not at all impressed with the notion that it's going to get updated anyway. But PostgreSQL very specifically (and as far as I can tell *intentionally*) allows you to *change* a value and have it still be considered *equal*. The concept of equal values really means more like equivalent or close enough for common purposes. It very specifically does *not* mean the same value. As just one example, think how much easier the citext type would be to implement if it folded all values to lower case as they were input, rather than preserving the data as entered and considering different capitalizations as equal. The notion that in PostgreSQL a value has not changed if the new value is equal to the old is just flat out wrong. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Kevin, On Wednesday, September 18, 2013, Kevin Grittner wrote: Stephen Frost sfr...@snowman.net javascript:; wrote: If it's not actually *changing* (wrt its value), then I'm not at all impressed with the notion that it's going to get updated anyway. But PostgreSQL very specifically (and as far as I can tell *intentionally*) allows you to *change* a value and have it still be considered *equal*. I'm curious where you're going with that- of course you can update a value and have the same value (and possibly the same byte representation) stored over the old. The concept of equal values really means more like equivalent or close enough for common purposes. It very specifically does *not* mean the same value. I'm really curious about your thoughts on unique indexes then. Should two numerics which are the same value but different byte representations be allowed in a unique index? As just one example, think how much easier the citext type would be to implement if it folded all values to lower case as they were input, rather than preserving the data as entered and considering different capitalizations as equal. If the type operator says they're equal, then I think we need to consider them as equal. If an update happens with a conditional of: where col1 = 'Abc' When col1 is 'ABC' using citext, should we still issue the update? The notion that in PostgreSQL a value has not changed if the new value is equal to the old is just flat out wrong. The value *can* be changed to be equal to the existing value but that doesn't make the two values *not equal*. Thanks, Stephen
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net wrote: I don't think that means we should change our definition of equality to generally be are the bytes the same- clearly that'd lead to incorrect behavior in the NUMERIC case. Nobody is talking in any way, shape, or form about changing our concept of what is equal. We're talking about recognizing that in PostgreSQL equal does *not* mean the same. If we used the equal concept for determining what has changed, if someone was tracking numeric data without precision and scale so that they could track accuracy (by storing the correct number of decimal positions) the accuracy could not be replicated to a materialized view. Of course, streaming replication would replicate the change, but if '1.4' was stored in a column copied into a matview and they later updated the source to '1.40' the increase in accuracy would not flow to the matview. That would be a bug, not a feature. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 09/18/2013 05:53 PM, Andres Freund wrote: On 2013-09-18 11:50:23 -0400, Stephen Frost wrote: For my 2c on this, while this can be useful for *us*, and maybe folks hacking pretty close to PG, I can't get behind introducing this as an '===' or some such operator. I've missed why this can't be a simple function and why in the world we would want to encourage users to use this by making it look like a normal language construct of SQL, which damn well better consider numbers which are equal in value to be equal, regardless of their representation. I certainly understand the feeling... I think this really needs to have an obscure name. Like ==!!== or somesuch (is equal very much, but doesn't actually test for equality ;)) In PostgreSQL equality can be anything :) In other words, we have pluggable equality, so it is entirely feasible to have an opclass where binary equality is *the* equality the problem started with some opclass equality (case insensitive comparison) missing user-visible changes. Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Stephen Frost sfr...@snowman.net Kevin Grittner wrote: Stephen Frost sfr...@snowman.net wrote: If it's not actually *changing* (wrt its value), then I'm not at all impressed with the notion that it's going to get updated anyway. But PostgreSQL very specifically (and as far as I can tell *intentionally*) allows you to *change* a value and have it still be considered *equal*. I'm curious where you're going with that- of course you can update a value and have the same value (and possibly the same byte representation) stored over the old. The way I see it, you can update a column to a different value which will compare as equal. That's fine. Nobody wants to change that. But it is still not the same value. The concept of equal values really means more like equivalent or close enough for common purposes. It very specifically does *not* mean the same value. I'm really curious about your thoughts on unique indexes then. Should two numerics which are the same value but different byte representations be allowed in a unique index? Not if it is defined with the default opclass, which will use an equal operator. Of course, this patch would allow an index on a record to be defined with record_image_ops, in which case it would sort by the raw bytes in the values of the record. That's not going to be useful in very many places, which is why it would not be the default. You don't get that behavior unless you ask for it. See this docs page for a similar example related to complex numbers: http://www.postgresql.org/docs/current/interactive/xindex.html#XINDEX-EXAMPLE If the type operator says they're equal, then I think we need to consider them as equal. Absolutely. Two different values may be equal within an opclass. If an update happens with a conditional of: where col1 = 'Abc' When col1 is 'ABC' using citext, should we still issue the update? Absolutely not, because the update was requested in the case that the equality test was true. Yet if a row is updated to replace 'Abc' with 'ABC', then streaming replication should copy the different but equal value (it does), a normal view should now show 'ABC' (it does), and a refresh of a matview should cause the matview to show 'ABC' (it doesn't in git, but this patch would make that work). The value *can* be changed to be equal to the existing value but that doesn't make the two values *not equal*. Nobody has ever argued that they should be considered *not equal*. It's just about providing a way to recognize when two equal values *are not the same*. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Kevin Grittner kgri...@ymail.com wrote: If an update happens with a conditional of: where col1 = 'Abc' When col1 is 'ABC' using citext, should we still issue the update? Absolutely not, because the update was requested in the case that the equality test was true. Sorry, as if this thread were not long enough, I misread that and gave the wrong answer. Yes, the equal operator was used and the equal operator for two citext values says those are equal, so the row *should* be updated. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Kevin Grittner kgri...@ymail.com writes: change, but if '1.4' was stored in a column copied into a matview and they later updated the source to '1.40' the increase in accuracy would not flow to the matview. That would be a bug, not a feature. Maybe the answer to that use case is to use the seg extension? http://www.postgresql.org/docs/9.3/interactive/seg.html IOW, colour me unconvinced about that binary-equality opclass use case in MatViews. We are trusting the btree equality operator about everywhere in PostgreSQL and it's quite disturbing to be told that in fact we should not trust it. Would it make sense for you to produce a patch without the extra operators, behavior, opclass and opfamily here so that we can focus on the MatView parts of it? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Kevin Grittner kgri...@ymail.com writes: change, but if '1.4' was stored in a column copied into a matview and they later updated the source to '1.40' the increase in accuracy would not flow to the matview. That would be a bug, not a feature. Maybe the answer to that use case is to use the seg extension? http://www.postgresql.org/docs/9.3/interactive/seg.html You are arguing that we should provide lesser support for numeric columns (and who knows how many other types) in materialized views than we do in streaming replication, pg_dump, suppress_redundant_updates_trigger(), and other places? Why? IOW, colour me unconvinced about that binary-equality opclass use case in MatViews. We are trusting the btree equality operator about everywhere in PostgreSQL and it's quite disturbing to be told that in fact we should not trust it. Who said we should not trust it? I have said that equality in PostgreSQL does not mean the two values appear the same or behave the same in all cases -- the definer of the class gets to determine how many definitions of equality there are, and which (if any) is tied to the = operator name. That should not be news to anybody; it's in the documentation. I'm proposing a second definition of equality with a different operator name for comparing two records, without in any way disturbing the existing definition. I am taken completely by surprise that in this case creating a second opclass for something is somehow controversial. The documentation I cited previously provides a clear example of another case where two completely different concepts of equality for a type are useful. We have, as a community, gone to a fair amount of trouble to make the concept of equality pluggable and allow multiple types of equality per type. To me it seems the perfect tool to solve this problem. Why the fuss? Would it make sense for you to produce a patch without the extra operators, behavior, opclass and opfamily here so that we can focus on the MatView parts of it? No, matviews cannot be fixed without the new operators. Here are the stats on the patch: kgrittn@Kevin-Desktop:~/pg/master$ git diff --stat master..matview contrib/citext/expected/citext.out | 41 +++ contrib/citext/expected/citext_1.out | 41 +++ contrib/citext/sql/citext.sql | 23 ++ src/backend/commands/matview.c | 7 +- src/backend/utils/adt/rowtypes.c | 482 ++ src/include/catalog/pg_amop.h | 10 + src/include/catalog/pg_amproc.h | 1 + src/include/catalog/pg_opclass.h | 1 + src/include/catalog/pg_operator.h | 14 + src/include/catalog/pg_opfamily.h | 1 + src/include/catalog/pg_proc.h | 12 +- src/include/utils/builtins.h | 7 + src/test/regress/expected/opr_sanity.out | 7 +- 13 files changed, 642 insertions(+), 5 deletions(-) The changes to matview.c are the only ones that are matview-specific. Basically, that consists of using the new operator instead of = in a couple places. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 09/18/2013 09:19 PM, Dimitri Fontaine wrote: Kevin Grittner kgri...@ymail.com writes: change, but if '1.4' was stored in a column copied into a matview and they later updated the source to '1.40' the increase in accuracy would not flow to the matview. That would be a bug, not a feature. Maybe the answer to that use case is to use the seg extension? http://www.postgresql.org/docs/9.3/interactive/seg.html IOW, colour me unconvinced about that binary-equality opclass use case in MatViews. We are trusting the btree equality operator about everywhere in PostgreSQL and it's quite disturbing to be told that in fact we should not trust it. The problem is, that in this case the simple VIEW and MATVIEW would yield different results. -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] record identical operator
On 09/18/2013 09:41 PM, Kevin Grittner wrote: Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Kevin Grittner kgri...@ymail.com writes: change, but if '1.4' was stored in a column copied into a matview and they later updated the source to '1.40' the increase in accuracy would not flow to the matview. That would be a bug, not a feature. Maybe the answer to that use case is to use the seg extension? http://www.postgresql.org/docs/9.3/interactive/seg.html You are arguing that we should provide lesser support for numeric columns (and who knows how many other types) in materialized views than we do in streaming replication, pg_dump, suppress_redundant_updates_trigger(), and other places? Why? IOW, colour me unconvinced about that binary-equality opclass use case in MatViews. We are trusting the btree equality operator about everywhere in PostgreSQL and it's quite disturbing to be told that in fact we should not trust it. Who said we should not trust it? I have said that equality in PostgreSQL does not mean the two values appear the same or behave the same in all cases -- the definer of the class gets to determine how many definitions of equality there are, and which (if any) is tied to the = operator name. That should not be news to anybody; it's in the documentation. I'm proposing a second definition of equality with a different operator name for comparing two records, without in any way disturbing the existing definition. Basically what proposed === does is is guaranteed to be equal. If it is not *guaranteed* it is safe to re-evaluate, either using equal or something else. -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers