Re: [HACKERS] Range Types: empty ranges
On 2/11/2011 1:50 PM, Kevin Grittner wrote: Josh Berkusj...@agliodbs.com wrote: if I, in one of my applications, accidentally defined something as having the range '('15:15:00','15:15:00')', I would *want* the database to through an error and not accept it. I can agree with that, but I think that range '[15:15:00,15:15:00)' should be valid as a zero-length range between, for example, '[15:00:00,15:15:00)' and '[15:15:00,15:30:00)'. Does ['15:15:00','15:15:00') make any more sense? Doesn't this essentially mean = '15:15:00' '15:15:00' which again doesn't include a single point on the time line? Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
Jan Wieck janwi...@yahoo.com wrote: Does ['15:15:00','15:15:00') make any more sense? Doesn't this essentially mean = '15:15:00' '15:15:00' which again doesn't include a single point on the time line? It defines a position in time with zero duration. Some of the graphics programming I've done in the past was based on a system where, at the pixel level, the coordinates referred to the boundaries *between* the pixels. If you were to have a number of horizontal bars, for example, the size of which you would adjust to represent fluctuating data, you might have an x coordinate of 20 for the left edge, and [20,30) would paint 10 pixels. I guess you *could* destroy and recreate the object when the number dropped to zero and came off it again; but the concept of [20,20) to draw zero pixels but maintain the positional anchor can be convenient. I see parallel concepts for some data domains in a database. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On 2/11/11 10:11 AM, Jeff Davis wrote: Thoughts? Do the benefits outweigh the costs? I guess I'm having trouble tying the concept of empty ranges to any reality external to the database. For example, what would the time range: '('15:15:00','15:15:00')' ... represent exactly? A non-existant point in time which might or might not be near 3:15 PM? For my personal use of ranges, I'm very reluctant to embrace a concept in the database which can't possibly depict something concrete. Really, it seems like you're trying to fix NULL by separating the concept of EMPTY from NULL. Which is a good idea in general, I'm just not sure that this is the way to do it. HOWEVER, I also recognize that range types might be used for scientific applications, in which the mathematical concepts of imaginary ranges and empty ranges might be valid. So I wouldn't want to prohibit this feature for the people who need it. BUT ... if I, in one of my applications, accidentally defined something as having the range '('15:15:00','15:15:00')', I would *want* the database to through an error and not accept it. So, if we allow empty ranges of this kind, I would want a GUC for allow_empty_ranges. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
Jeff Davis pg...@j-davis.com wrote: The philosophy is that they are essentially the zero value of any range type. Like the number zero, it allows closure over operations that would otherwise return an error. Well, zero has a pretty well defined and easy to understand meaning. How many ostriches do you have in your back yard? In my case, zero. That's distinct from NULL, which would mean the answer is unknown. Similarly, intersection of ranges is somewhat analogous to multiplication of numbers. Ah, so the point is that the intersection of '[13:00,15:00)' and '[08:00,10:00)' is the empty range, not NULL -- because you *know* they don't overlap? That seems useful. What seems not useful is to say that such a value is both strictly to the right *and* strictly to the left of a range. That seems to fly in the face of the normal meaning of strict. Nor does it seem reasonable to say that it is adjacent to a range. It seems like it would be useful to be able to say that if A is adjacent to B and B is adjacent to C that the three ranges could be considered as one contiguous range. You'd get in trouble with such an assumption if an empty B was considered adjacent to all ranges. I see similar problems with an empty B if A is strictly to the left of B and B is strictly to the left of C, where C is strictly to the left of A. I'm somewhat inclined to agree with Robert that such comparisons should not throw errors. At this point, my inclination is to say that UNKNOWN should be the result (which I believe is equivalent to NULL::boolean). -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
Josh Berkus j...@agliodbs.com wrote: if I, in one of my applications, accidentally defined something as having the range '('15:15:00','15:15:00')', I would *want* the database to through an error and not accept it. I can agree with that, but I think that range '[15:15:00,15:15:00)' should be valid as a zero-length range between, for example, '[15:00:00,15:15:00)' and '[15:15:00,15:30:00)'. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, Feb 11, 2011 at 1:11 PM, Jeff Davis pg...@j-davis.com wrote: Similarly, intersection of ranges is somewhat analogous to multiplication of numbers. I had a feeling that we might be going in this direction. It strikes me that this case is a bit like division by zero. It's kind of a nuisance that dividing by zero throws an error and we COULD fix that by making it return NULL or NaN or some new distinguished value DbZ. But then we'd have to define what happens when you feed DbZ into every other operation in the system, and similarly here. If we define two non-overlapping ranges as intersecting to NULL, or as throwing an error, then everything else is clear after that. I'm not sure it's worth complicating the representation and the definitions of other operations to cater to this case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, Feb 11, 2011 at 1:50 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Josh Berkus j...@agliodbs.com wrote: if I, in one of my applications, accidentally defined something as having the range '('15:15:00','15:15:00')', I would *want* the database to through an error and not accept it. I can agree with that, but I think that range '[15:15:00,15:15:00)' should be valid as a zero-length range between, for example, '[15:00:00,15:15:00)' and '[15:15:00,15:30:00)'. How would that actually work? I kind of agree with Josh: I'd be inclined to make the type input function boot such constructs at the door. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, 2011-02-11 at 10:28 -0800, Josh Berkus wrote: I guess I'm having trouble tying the concept of empty ranges to any reality external to the database. That's true, but in the same sense as zero has no meaning outside of the database. It's really that it has nice mathematical properties coming from set theory. Take the distributive law: A UNION (B INTERSECT C) = (A UNION B) INTERSECT (A UNION C) If (B INTERSECT C) is empty, then the result should be A. With empty ranges, that holds on both sides. Without them, it doesn't hold on the left side. People implicitly use this kind of logic all the time when constructing queries. If one form works, and the other throws an error, it will cause confusion. For example, what would the time range: '('15:15:00','15:15:00')' ... represent exactly? A non-existant point in time which might or might not be near 3:15 PM? That's meaningless and will throw an error. An empty range is not anchored at any particular point, so any two empty ranges are equal. BUT ... if I, in one of my applications, accidentally defined something as having the range '('15:15:00','15:15:00')', I would *want* the database to through an error and not accept it. Absolutely. That kind of input should throw an error (and does). So, if we allow empty ranges of this kind, I would want a GUC for allow_empty_ranges. I think that would be the least desirable option. If we don't like empty ranges, let's prohibit them entirely. Or, there are always check constraints... Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
Robert Haas robertmh...@gmail.com wrote: I think that range '[15:15:00,15:15:00)' should be valid as a zero-length range between, for example, '[15:00:00,15:15:00)' and '[15:15:00,15:30:00)'. How would that actually work? I kind of agree with Josh: I'd be inclined to make the type input function boot such constructs at the door. It makes more sense in the context of a range of some type with a clearly defined granularity. Our accounting system, for example, can assign a new range of receipt IDs for each calendar year. If you want a variable to represent the receipts for traffic receipts for 2012, you might, in preparation for the upcoming year, define something to declare the range as '[12TR01,12TR01)'. When the first receipt is assigned as 12TR01, that is updated to '[12TR01,12TR02)'. Just as an off-the-cuff example. Basically, with a type having well-defined granularity, a [) range could usefully represent, start to last used, and start out empty. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, Feb 11, 2011 at 2:06 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2011-02-11 at 10:28 -0800, Josh Berkus wrote: I guess I'm having trouble tying the concept of empty ranges to any reality external to the database. That's true, but in the same sense as zero has no meaning outside of the database. It's really that it has nice mathematical properties coming from set theory. Take the distributive law: A UNION (B INTERSECT C) = (A UNION B) INTERSECT (A UNION C) But the basic range type isn't even closed under UNION. It seems like what you really need if you want this stuff to work out smoothly is a MULTIRANGE type. Then this problem goes away: the union or intersection of two ranges is a multirange, and after that everything is closed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
Kevin Grittner kevin.gritt...@wicourts.gov wrote: Basically, with a type having well-defined granularity, a [) range could usefully represent, start to last used, and start out empty. I guess that would actually be start to next available -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, 2011-02-11 at 13:08 -0600, Kevin Grittner wrote: It makes more sense in the context of a range of some type with a clearly defined granularity. Our accounting system, for example, can assign a new range of receipt IDs for each calendar year. If you want a variable to represent the receipts for traffic receipts for 2012, you might, in preparation for the upcoming year, define something to declare the range as '[12TR01,12TR01)'. When the first receipt is assigned as 12TR01, that is updated to '[12TR01,12TR02)'. Just as an off-the-cuff example. Basically, with a type having well-defined granularity, a [) range could usefully represent, start to last used, and start out empty. I think this is trying to make a range into something that's not. A range is a set of values with the additional constraint that there are no gaps. Trying to incorporate a start value is adding extra information in there, and it's not really a part of the same algebra. It sounds more like a contiguous sequence with a start value and a current value to me. Sequences have other useful operations, like next, and things like overlaps don't really seem to make sense (at least not in a practical way that I can tell). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Feb 11, 2011, at 10:28 AM, Josh Berkus wrote: So, if we allow empty ranges of this kind, I would want a GUC for allow_empty_ranges. GUC you, josh. ;-P D -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, 2011-02-11 at 13:50 -0500, Robert Haas wrote: On Fri, Feb 11, 2011 at 1:11 PM, Jeff Davis pg...@j-davis.com wrote: Similarly, intersection of ranges is somewhat analogous to multiplication of numbers. I had a feeling that we might be going in this direction. It strikes me that this case is a bit like division by zero. Except that we do happen to allow the value zero and wait 'til someone divides by it before throwing an error. So I think that's more of a point toward allowing empty ranges than rejecting them. It's kind of a nuisance that dividing by zero throws an error and we COULD fix that by making it return NULL or NaN or some new distinguished value DbZ. But empty ranges are actually quite well-defined, in a way similar to an empty set. * it can meaningfully result in a non-empty range at a later stage of computation * it increases the number of tautologies, rather than decreasing them like NULL I guess what I'm saying is that DbZ doesn't seem particularly useful to carry along, while and empty range plausibly is. But then we'd have to define what happens when you feed DbZ into every other operation in the system, and similarly here. If your point is that empty ranges need to be handled specially sometimes, I agree. That is the semantic cost which I identified in the original email. Are the benefits worth it? If we define two non-overlapping ranges as intersecting to NULL, or as throwing an error, then everything else is clear after that. Well, there is a certain amount of localized clarity, I will agree with that. The complexity comes when you accidentally rely on some transformation which seems logically sound, but could result in a transient empty range, which then throws an error. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, 2011-02-11 at 14:14 -0500, Robert Haas wrote: It's really that it has nice mathematical properties coming from set theory. Take the distributive law: A UNION (B INTERSECT C) = (A UNION B) INTERSECT (A UNION C) But the basic range type isn't even closed under UNION. An excellent point. Allow me to move the target a little: WHERE A B AND A C and: WHERE A (B INTERSECT C) That seems like a logically sound transformation, but if (B INTERSECT C) is empty, it relies on the empty range for those two to be equivalent. And that would be a runtime error, caught during testing only if you're lucky. Now, I agree that lack of closure on UNION exhibits many of the problems that I am pointing out related to forbidding empty ranges. However, I'm not sure if that means we should give up on either. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, Feb 11, 2011 at 3:03 PM, Jeff Davis pg...@j-davis.com wrote: Well, there is a certain amount of localized clarity, I will agree with that. The complexity comes when you accidentally rely on some transformation which seems logically sound, but could result in a transient empty range, which then throws an error. But by this argument you also need to support discontiguous ranges, don't you? I mean, if you want to insist that A intersect B has to still be a legal range, what about A union B? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
Jeff Davis pg...@j-davis.com wrote: Trying to incorporate a start value is adding extra information in there, and it's not really a part of the same algebra. It sounds more like a contiguous sequence with a start value and a current value to me. Well, in the receipt number example there are multiple ranges in use for each year, and ranges for multiple years. If we get to the idea of a multi-ranges, this would be very handy for certain types of reports -- especially for auditors. It's not that we can't do with with discrete begin and end columns -- we do that now; but it seemed a potentially beneficial use of ranges for us, if they can represent the needed states. People already talk about these as ranges, just in terms of the common English understanding of the word. Perhaps it was a mistake to get so concrete rather than conceptual -- basically, it seems like it could be a useful concept for any planned or scheduled range with an indeterminate end point, which you want to reserve up front and record in progress until complete. The alternative would be that such ranges to be have a parallel planned start value column of the same type as the range, to be used as the start of the range once it is not empty. Or, as another way to put it, it seems potentially useful to me to have an empty range which is pinned to a location, in *addition* to the unpinned empty ranges such as would be needed to represent the intersection of two non-overlapping ranges. Of course, the *most* useful places for our shop to have ranges are temporal. Many (most?) of those are situations where you start with a range with an unknown end and later (often years later) fill in the end of the range based on some event which finally closes it. Again, two discreet dates with a null-capable end-date work, but I can see where ranges could potentially be more powerful. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, 2011-02-11 at 14:19 -0600, Kevin Grittner wrote: Well, in the receipt number example there are multiple ranges in use for each year, and ranges for multiple years. If we get to the idea of a multi-ranges, this would be very handy for certain types of reports -- especially for auditors. It's not that we can't do with with discrete begin and end columns -- we do that now; but it seemed a potentially beneficial use of ranges for us, if they can represent the needed states. People already talk about these as ranges, just in terms of the common English understanding of the word. I think that might indicate that the word range is a little too loose. The definition that I've been using is more like a mathematical interval. Perhaps it was a mistake to get so concrete rather than conceptual -- basically, it seems like it could be a useful concept for any planned or scheduled range with an indeterminate end point, which you want to reserve up front and record in progress until complete. Maybe this is a range... would it be served by: (5, INF) or: [5, INF) ? That's already supported, and it means all points greater than 5. Of course, the *most* useful places for our shop to have ranges are temporal. Many (most?) of those are situations where you start with a range with an unknown end and later (often years later) fill in the end of the range based on some event which finally closes it. Again, two discreet dates with a null-capable end-date work, but I can see where ranges could potentially be more powerful. Ranges support infinite boundaries, but do not support NULL (previous discussion concluded that NULL boundaries were likely to be confusing and served no obvious use case not handled by infinity). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, 2011-02-11 at 15:14 -0500, Robert Haas wrote: On Fri, Feb 11, 2011 at 3:03 PM, Jeff Davis pg...@j-davis.com wrote: Well, there is a certain amount of localized clarity, I will agree with that. The complexity comes when you accidentally rely on some transformation which seems logically sound, but could result in a transient empty range, which then throws an error. But by this argument you also need to support discontiguous ranges, don't you? I mean, if you want to insist that A intersect B has to still be a legal range, what about A union B? I responded to a similar question/point here: http://archives.postgresql.org/pgsql-hackers/2011-02/msg01073.php Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
Jeff Davis pg...@j-davis.com wrote: Perhaps it was a mistake to get so concrete rather than conceptual -- basically, it seems like it could be a useful concept for any planned or scheduled range with an indeterminate end point, which you want to reserve up front and record in progress until complete. Maybe this is a range... would it be served by: (5, INF) or: [5, INF) ? That's already supported, and it means all points greater than 5. Well that would create a range of infinite size until you started using it, at which point it would drop to tiny and start to increase until completion, which would be kinda weird as well as error-prone. At least for the uses I'm considering. I'm sure we can work around it if it isn't supported, but for our purposes, [x,x) ranges would be a useful construct. Of course, the *most* useful places for our shop to have ranges are temporal. Many (most?) of those are situations where you start with a range with an unknown end and later (often years later) fill in the end of the range based on some event which finally closes it. Again, two discreet dates with a null-capable end-date work, but I can see where ranges could potentially be more powerful. Ranges support infinite boundaries, but do not support NULL (previous discussion concluded that NULL boundaries were likely to be confusing and served no obvious use case not handled by infinity). Yeah, infinity works fine as long as you realize it's only temporary. I don't know whether there are any practical uses where on the low end it doesn't mean some particular date we don't know or don't care enough to determine and on the high end doesn't mean some particular date which we don't know or which hasn't yet been fixed. For the latter we have, so far, been using NULL, since we were trying to stay portable and not all products support infinity. We're committing to PostgreSQL now, so we could start to use infinity for these temporarily indefinite values, and I would expect to do so in ranges. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
FWIW, a very informal survey of probabilists didn't yield any reason for trying to put an order on the empty set ( unless the metric was cardinality or other equivalence relation ). I think the problem here is that the idea of union and intersection forming a ring over sets is being conflated with the order relation. Clearly, combining the two notions can be inconsistent. However... A UNION (B INTERSECT C) = (A UNION B) INTERSECT (A UNION C) But the basic range type isn't even closed under UNION. An excellent point. Allow me to move the target a little: WHERE A B AND A C and: WHERE A (B INTERSECT C) That seems like a logically sound transformation, but if (B INTERSECT C) is empty, it relies on the empty range for those two to be equivalent. Now, I agree that lack of closure on UNION exhibits many of the problems that I am pointing out related to forbidding empty ranges. However, I'm not sure if that means we should give up on either. This seems potentially very useful, because we can transform WHERE A B AND A C from a bitmap scan into WHERE A (B INTERSECT C), a simple index scan. In the union case ( even if we had a type that supported disjoint intervals), I doubt we would ever make that transformation because the index will probably still be over connected intervals. So, +1 for keeping it how it is ( but maybe with a better error message ). Best, Nathan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Range Types: empty ranges
On Fri, Feb 11, 2011 at 10:11:45AM -0800, Jeff Davis wrote: The cost, of course, is that not all operations are well-defined for empty ranges. I think those are mostly operators like those mentioned in the other thread: (strictly right of), (strictly left of), and -|- (adjacent); and perhaps and . These are probably used a little less frequently, and should probably not be used in a context where empty ranges are permitted (if they are, it's likely a mistake and an error should be thrown). But surely this is just a matter of your definitions? If you define A strictly left of B as all points in A are less than all points in B, then you might have a problem with all points of an empty range. If you define it as not (any points in A to the right of any points in B) then the answer for an empty range is well defined (namely true). I think once you make workable definitions then empty ranges should pose no problems. Sure, it may be that an empty range will be both to the left and to the right of every other set. That doesn't make it wrong. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle signature.asc Description: Digital signature