Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On 2/17/07, Tom Lane [EMAIL PROTECTED] wrote: Hannu Krosing [EMAIL PROTECTED] writes: How easy/hard would it be to create unique indexes on tinterval (unique here meaning non-overlapping) ? Overlapping is not an equality relation (it fails the transitive law), so I'm not entirely sure what unique means in this context ... but I can promise you you can't make it work with btree. Hmm, let's assume two time intervals: A (with a0 as start and a1 as end times) B (woth b0 as start and b1 as end times) Now, we'd define operators as: A is left of B when a0 b0 AND a1 b0 A is right of B when a0 b1 AND a1 b1 A is equal to B if (a0 = b0 AND a0 = b1) OR (a1 = b0 AND a1 = b1) OR (a0 b0 AND a1 b1) Actually equal doesn't mean equal here, rather it says overlaps. Now, assuming UNIQUE INDEX on such table, the order would be preserved since no two intervals can overlap. And no overlapping data could be inserted without breaking ovelapivity. And of course non-unique index would produce garbage (since left of/right of wouldn't make any sense anymore). Interestingly, such non-overlapping datatypes could also make sense for network addresses (with netmasks). ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On 17/02/07, Warren Turkal [EMAIL PROTECTED] wrote: PERIOD(INT) is actually listed in the Dr. Snodgrass's book. However, I am not really sure about the semantics of the type. When would you use a PERIOD(INT)? It wouldn't be directly useful for temporal SQL, but I have a number of tables in a database application where a range of integers is mapped onto a 0-100 range (e.g. 1-5 might get mapped onto 1, 6-15 to 2, etc), which I'd like to store using a (preferably non-overlapping) period type. Also, please bring this discussion back on list so that it gets recorded. I didn't want to post your private reply to the list without your permission. Oops, meant to reply to the list originally. Ian ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
Dawid Kuroczko [EMAIL PROTECTED] writes: ... Now, assuming UNIQUE INDEX on such table, the order would be preserved since no two intervals can overlap. And no overlapping data could be inserted without breaking ovelapivity. And of course non-unique index would produce garbage (since left of/right of wouldn't make any sense anymore). I think actually it doesn't work for unique indexes either :-( because of dead tuples. Consider that we have in the index ... (1,2) (6,8) DEAD (4,10) (12,14) ... Since under the given operators (6,8) and (4,10) are equal, btree will not guarantee that those index entries appear in any particular relative order. Thus the above is a legal index configuration. Now insert (3,5). This should surely be rejected because it overlaps (4,10). But what may well happen is that it gets compared to (1,2) --- OK, it's greater --- and to (6,8) --- OK, it's less --- and then the uniqueness check stops, because if it's less than (6,8) then there is no need to search further. Ooops. *This* is why the transitive law is essential. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
Ühel kenal päeval, L, 2007-02-17 kell 11:26, kirjutas Tom Lane: Hannu Krosing [EMAIL PROTECTED] writes: How easy/hard would it be to create unique indexes on tinterval (unique here meaning non-overlapping) ? Overlapping is not an equality relation (it fails the transitive law), so I'm not entirely sure what unique means in this context ... Well, unique is usually defined as not equal to any other. And not equal also fails transitive law. So I can't see, how failing it makes the defining the meaning of unique harder. What I mean by unique interval here is an interval over unique points, or just an interval which does not overlap any other interval. If our uniqueness implementation relies on reverse operation (equality) being transitive, then it can't be used for enforcing unique intervals. But it should be trivial to test at insertion time if the interval overlaps with any existing intervals, as it has to be inserted before the previous unique interval and after the next interval. In other words, unique interval index is a unique index of interval start times with extra condition that interval end time is not = than the next intervals start. but I can promise you you can't make it work with btree. Sorry to hear that. btree seemed like the best candidate for doing it. -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On Sun, Feb 18, 2007 at 08:14:00PM +0200, Hannu Krosing wrote: but I can promise you you can't make it work with btree. Sorry to hear that. btree seemed like the best candidate for doing it. The problem with btree is that it's designed to work with a compare function which compares two datums and returns greater than, equal to or less than. You can't build such an operator for intervals, so there's a problem. However, if you decree that a zero return value mean collision for the purposes of a unique index then you could probably make it work. *However* using it for lookups probably won't work very well then... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Well, unique is usually defined as not equal to any other. And not equal also fails transitive law [...] But it should be trivial to test at insertion time if the interval overlaps with any existing intervals [...] Putting your point another way: you might construe an equivalence relation by grouping together all intervals which (directly or indirectly) touch each other. Let's say they are connected. But then the problem becomes clear: let's assume A and C are not connected (i.e. they are in different equivalence classes). Now you add B, which happens to overlap A and C. Now A and C are connected. How do you care for that in your index? That can't happen with a classical equivalence relation, which wouldn't change among existing elements when you add a new one. Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFF2TmLBcgs9XrR2kYRAmIHAJ4+x1mOum1rvBkS8/Pypcu8w2QIIQCffFm5 No5aOh901rxfc2mpRYpJMAU= =7Isi -END PGP SIGNATURE- ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
Ühel kenal päeval, R, 2007-02-16 kell 17:39, kirjutas Alvaro Herrera: Jim C. Nasby wrote: My suggestion would be to focus on a period data type first and foremost, as that's something that could be readily used by a lot of folks. Of particular note, it's difficult to query tables that have start_time and end_time fields to define a period; it's easy to screw up the boundary conditions, and it's also hard to make those queries perform well without going to extra lengths (such as defining a 'bogus' GiST index on something like box(point(start,start),point(end,end)). And it's not possible to do that in a way that avoids floating points and their errors. FWIW there's already a type called tinterval that stores (start,end). I don't think it's very much documented; maybe it can be extended or used as base for a new, more complete and robust type, indexable in a more natural way, etc etc. How easy/hard would it be to create unique indexes on tinterval (unique here meaning non-overlapping) ? Is tinterval meant to be open/closed at start and end ? -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
Hannu Krosing [EMAIL PROTECTED] writes: How easy/hard would it be to create unique indexes on tinterval (unique here meaning non-overlapping) ? Overlapping is not an equality relation (it fails the transitive law), so I'm not entirely sure what unique means in this context ... but I can promise you you can't make it work with btree. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On Saturday 17 February 2007 01:50, Hannu Krosing wrote: Is tinterval meant to be open/closed at start and end ? I don't see the tinterval doing anything other than storing two times. wt -- Warren Turkal (w00t) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On Saturday 17 February 2007 09:26, Tom Lane wrote: Overlapping is not an equality relation (it fails the transitive law), so I'm not entirely sure what unique means in this context ... but I can promise you you can't make it work with btree. There is an equality relation on periods. But it wouldn't really tell you much useful info, as it's not normally what you're looking for with time. wt -- Warren Turkal (w00t) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On Sat, Feb 17, 2007 at 11:40:44AM -0700, Warren Turkal wrote: On Saturday 17 February 2007 09:26, Tom Lane wrote: Overlapping is not an equality relation (it fails the transitive law), so I'm not entirely sure what unique means in this context ... but I can promise you you can't make it work with btree. There is an equality relation on periods. But it wouldn't really tell you much useful info, as it's not normally what you're looking for with time. What he's referring to is that overlaps is not transitive. i.e. if A overlaps B and B overlaps C then A doesn't necessarily overlap C. However, non-overlapping intervals are stricly ordered, so if you reject overlaps from the index then new intervals can each only be inserted into one place. However, the locking required is probably non-trivial. Get unique indexes for GiST working and you're home... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
My suggestion would be to focus on a period data type first and foremost, as that's something that could be readily used by a lot of folks. Of particular note, it's difficult to query tables that have start_time and end_time fields to define a period; it's easy to screw up the boundary conditions, and it's also hard to make those queries perform well without going to extra lengths (such as defining a 'bogus' GiST index on something like box(point(start,start),point(end,end)). And it's not possible to do that in a way that avoids floating points and their errors. On Sat, Feb 10, 2007 at 12:20:28AM -0700, Warren Turkal wrote: Temporal Extensions for PostgreSQL by: Warren Turkal I would like to see a comprehensive solution to time varying tables (or temporal) in PostgreSQL. I specifically want to see suuport for valid-time and transacation-time and bitemporal (valid-time and transaction-time) tables. I will be defering the descriptions of much of the functionality to Dr. Richard T. Snodgrass's _Developing Time-Oriented Database Applications in SQL_ at [1]. The mangled pages 30-31 are at [2]. a) Functionality Dr. Richard T. Snodgrass has worked on defining semantics of temporal very completely in several writings. He was also involved in an unsuccessful effort to standardize temporal extensions to SQL. I believe his book does a good job in presenting the semantics of temporal databases and describing extensions to SQL that make the data much more natural with which to work. b) How current solutions fall flat Current solutions fall flat due to the extreme complexity of implementing valid-time and transaction time semantics on tables by adding columns to track all of the data. Please see chapter 11 of [1] for a more complete description of this complexity. Chapter 12 of [1] goes on to lay out new syntax for SQL that will make dealing with data of this nature much more natural. c) Examples --create normal table CREATE TABLE products ( id SERIAL PRIMARY KEY , description TEXT ); -- Add valid-time support to the table with granularity of timestamp. ALTER TABLE products ADD VALIDTIME PERIOD(TIMESTAMP WITH TIMEZONE); -- Insert row valid from 2006-01-01 to just before 2007-01-01 VALIDTIME PERIOD '[2006-01-01 - 2007-01-01)' INSERT INTO products ( description ) VALUES ( 'red ball' ); -- Insert row valid from 2007-01-01 to just before 2008-01-01 -- Should be smart enough to realize the id=777 does not conflict in this time -- of validity. VALIDTIME PERIOD '[2007-01-01 - 2008-01-01)' INSERT INTO products ( id , description ) VALUES ( 777 , 'blue ball' ); -- Select history of products with id=777 VALIDTIME SELECT * FROM product WHERE id=777; id | description | valid_period -- 777| red ball| [2006-01-01 - 2007-01-01) 777| blue ball | [2007-01-01 - 2008-01-01) -- Select current products with id=777 -- The date when query was run was 2007-02-10. SELECT * FROM products WHERE id=777; id | description -- 777| blue ball There are many more details in chapter 12 of [1]. d) New stuff (dependencies, indices, syntax, libraries) One of the base level additions is the PERIOD datatype. I think that implementing temporal support is reliant on developing such a type. The description of this datatype is laid out in chapter 4 of [1]. The SQL syntax is present in chapter 12 of [1]. I see this as the first piece that needs to be implemented in order to take steps toward a DBMS to supports full temporal capabilities. I think that PERIOD can largely reuse the datatime functionality for parsing of literals and for comparisons. The RTREE seems to nicely incorporate needed indexing of the PERIOD type. The syntax of the parser will have to be extended to handle the PERIOD literals and constructor. I believe any additional libraries will be required. There are also extensions to the syntax of table creation, table altering, querying, inserting, and updating on temporal tables. These are all discussed in some detail in chapter 12 of [1]. I don't think that any of these changes will require new libraries. The semantics of temporal tables and querying them could have a dramatic affect on how things like primary keys and unique constraints work. I would like to get some comments about this from the community. e) See Also Addtional resources can be found at Dr. Richard T. Snodgrass's website at [3], including SQL valid-time table support spec at [4] and SQL transaction-time table support spec at [5]. Thoughts? Questions? Comments? [1]http://www.cs.arizona.edu/~rts/tdbbook.pdf
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
Jim C. Nasby wrote: My suggestion would be to focus on a period data type first and foremost, as that's something that could be readily used by a lot of folks. Of particular note, it's difficult to query tables that have start_time and end_time fields to define a period; it's easy to screw up the boundary conditions, and it's also hard to make those queries perform well without going to extra lengths (such as defining a 'bogus' GiST index on something like box(point(start,start),point(end,end)). And it's not possible to do that in a way that avoids floating points and their errors. FWIW there's already a type called tinterval that stores (start,end). I don't think it's very much documented; maybe it can be extended or used as base for a new, more complete and robust type, indexable in a more natural way, etc etc. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On Fri, Feb 16, 2007 at 05:39:24PM -0300, Alvaro Herrera wrote: FWIW there's already a type called tinterval that stores (start,end). I don't think it's very much documented; maybe it can be extended or used as base for a new, more complete and robust type, indexable in a more natural way, etc etc. The book I cited has a very complete description of the period data type including details on what extensions to SQL are needed. I am very interested in starting a robust implementation of the period datatype. I think the datetime infrastructure will already do most of the needed parsing and packing of the hard parts of the period datatype (namely the date and time formats). I will investigate the tinterval to see if it meets the needs of the PERIOD datatypes. I agree with focusing on the PERIOD datatype. I think that is a major part of the foundation for temporal extensions and would have to be implemented first. Therefore, I present the following plan for getting there. 1) Focus first on PERIOD(DATE) to keep things as simple as possible. 2) Implement a first cut on the period datatype that only handles storing two dates. (Maybe tinterval will get us here for free?) 3) Add information to the datatype for open or closed interval for beginning and ending sides of the period. I could probably have this done in time for the freeze with some mentoring. I could probably even start implementation of some indices and operator function for the type. This functionality is what I expect to have a shot of making an appearance in 8.3. It will be minimally functional at this point. The next project will be altering the parser to be able to construct and operate on PERIOD types with the syntax extensions to SQL in Dr. Snodgrass's book. Once all of the syntax is implemented for PERIOD(DATE), the next project will be to extend to support PERIOD(DATETIME WITH TIMEZONE). Again, I think the datatime infrastructure will be very useful here. wt ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On Fri, 16 Feb 2007, Alvaro Herrera wrote: Jim C. Nasby wrote: My suggestion would be to focus on a period data type first and foremost, as that's something that could be readily used by a lot of folks. Of particular note, it's difficult to query tables that have start_time and end_time fields to define a period; it's easy to screw up the boundary conditions, and it's also hard to make those queries perform well without going to extra lengths (such as defining a 'bogus' GiST index on something like box(point(start,start),point(end,end)). And it's not possible to do that in a way that avoids floating points and their errors. FWIW there's already a type called tinterval that stores (start,end). I don't think it's very much documented; maybe it can be extended or used as base for a new, more complete and robust type, indexable in a more natural way, etc etc. RI-Tree (Relational intervar tree) http://www.dbs.informatik.uni-muenchen.de/Forschung/CAD/presentations/RI-Tree.pdf looks promising for that purposes. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] RFC: Temporal Extensions for PostgreSQL
Temporal Extensions for PostgreSQL by: Warren Turkal I would like to see a comprehensive solution to time varying tables (or temporal) in PostgreSQL. I specifically want to see suuport for valid-time and transacation-time and bitemporal (valid-time and transaction-time) tables. I will be defering the descriptions of much of the functionality to Dr. Richard T. Snodgrass's _Developing Time-Oriented Database Applications in SQL_ at [1]. The mangled pages 30-31 are at [2]. a) Functionality Dr. Richard T. Snodgrass has worked on defining semantics of temporal very completely in several writings. He was also involved in an unsuccessful effort to standardize temporal extensions to SQL. I believe his book does a good job in presenting the semantics of temporal databases and describing extensions to SQL that make the data much more natural with which to work. b) How current solutions fall flat Current solutions fall flat due to the extreme complexity of implementing valid-time and transaction time semantics on tables by adding columns to track all of the data. Please see chapter 11 of [1] for a more complete description of this complexity. Chapter 12 of [1] goes on to lay out new syntax for SQL that will make dealing with data of this nature much more natural. c) Examples --create normal table CREATE TABLE products ( id SERIAL PRIMARY KEY , description TEXT ); -- Add valid-time support to the table with granularity of timestamp. ALTER TABLE products ADD VALIDTIME PERIOD(TIMESTAMP WITH TIMEZONE); -- Insert row valid from 2006-01-01 to just before 2007-01-01 VALIDTIME PERIOD '[2006-01-01 - 2007-01-01)' INSERT INTO products ( description ) VALUES ( 'red ball' ); -- Insert row valid from 2007-01-01 to just before 2008-01-01 -- Should be smart enough to realize the id=777 does not conflict in this time -- of validity. VALIDTIME PERIOD '[2007-01-01 - 2008-01-01)' INSERT INTO products ( id , description ) VALUES ( 777 , 'blue ball' ); -- Select history of products with id=777 VALIDTIME SELECT * FROM product WHERE id=777; id | description | valid_period -- 777| red ball| [2006-01-01 - 2007-01-01) 777| blue ball | [2007-01-01 - 2008-01-01) -- Select current products with id=777 -- The date when query was run was 2007-02-10. SELECT * FROM products WHERE id=777; id | description -- 777| blue ball There are many more details in chapter 12 of [1]. d) New stuff (dependencies, indices, syntax, libraries) One of the base level additions is the PERIOD datatype. I think that implementing temporal support is reliant on developing such a type. The description of this datatype is laid out in chapter 4 of [1]. The SQL syntax is present in chapter 12 of [1]. I see this as the first piece that needs to be implemented in order to take steps toward a DBMS to supports full temporal capabilities. I think that PERIOD can largely reuse the datatime functionality for parsing of literals and for comparisons. The RTREE seems to nicely incorporate needed indexing of the PERIOD type. The syntax of the parser will have to be extended to handle the PERIOD literals and constructor. I believe any additional libraries will be required. There are also extensions to the syntax of table creation, table altering, querying, inserting, and updating on temporal tables. These are all discussed in some detail in chapter 12 of [1]. I don't think that any of these changes will require new libraries. The semantics of temporal tables and querying them could have a dramatic affect on how things like primary keys and unique constraints work. I would like to get some comments about this from the community. e) See Also Addtional resources can be found at Dr. Richard T. Snodgrass's website at [3], including SQL valid-time table support spec at [4] and SQL transaction-time table support spec at [5]. Thoughts? Questions? Comments? [1]http://www.cs.arizona.edu/~rts/tdbbook.pdf [2]http://www.cs.arizona.edu/~rts/pp30-31.pdf [3]http://www.cs.arizone.edu/~rts/ [4]ftp://ftp.cs.arizona.edu/tsql/tsql2/sql3/mad146.pdf [5]ftp://ftp.cs.arizona.edu/tsql/tsql2/sql3/mad147.pdf Thanks, wt -- Warren Turkal (w00t) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq