[Fwd: Re: [HACKERS] SQL99 Hierarchical queries]
Looks like hierarchical queries are now officially stalled :( Anyone want to take this up for 8.1? Chris Original Message Subject: Re: [HACKERS] SQL99 Hierarchical queries Date: Sun, 15 May 2005 07:31:16 +0400 From: Evgen Potemkin [EMAIL PROTECTED] Reply-To: Evgen Potemkin [EMAIL PROTECTED] To: Christopher Kings-Lynne [EMAIL PROTECTED] References: [EMAIL PROTECTED] [EMAIL PROTECTED] Hi, I haven't done any significant progress on that way because of lack of free time. Beside this, I'm recently changed my job and now I'm woking for MySQL. I think it's not possible for me to continue work on PostgreSQL. Feel free to take the patch and develop it further as long as you mention me as author of initial version. Regards, Evgen. On 5/5/05, Christopher Kings-Lynne [EMAIL PROTECTED] wrote: Hi Evgen, I just keep pinging this patch thread every once in a while to make sure it doesn't get forgotten :) How is the syncing with 8.1 CVS coming along? Chris Evgen Potemkin wrote: Hi hackers! I have done initial implementation of SQL99 WITH clause (attached). It's now only for v7.3.4 and haven't a lot of checks and restrictions. It can execute only simple WITH queries like WITH tree AS (SELECT id,pnt,name,1::int AS level FROM t WHERE id=0 UNION SELECT t.id,t.pnt,t.name,tree.level+1 FROM t JOIN tree ON tree.id=t.pnt ) SELECT * FROM tree; It would be great if someone with knowledge of Pg internals can make a kind of code review and make some advices how to better implement all this. Regards, Evgen. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] Planned change of ExecRestrPos API
I'm planning to change ExecRestrPos and the routines it calls so that an updated TupleTableSlot holding the restored-to tuple is explicitly returned. Currently, since nothing is explicitly done to the result Slot of a plan node when we restore its position, you might think that the Slot still points at the tuple that was current just before the Restore. You'd be wrong though, at least for seqscan and indexscan plans (I haven't looked yet at the other node types that support mark/restore). The reason is that the restore operation changes the contents of a HeapTupleData struct in the scan state (rs_ctup or xs_ctup) and all that the Slot really contains is a pointer to that struct. Now this is really bad. In the first place, the Slot thinks it has a pin on the buffer containing its current tuple. After a Restore, it may have pin on the wrong buffer. It seems to be sheer chance that we've not had bugs due to this. (The underlying scan does have pin on the right buffer, but one can easily imagine sequences in which the scan could be cleared while the Slot is still assumed valid.) As of CVS tip the consequences could be even worse, because the Slot may contain some pointers to extracted fields of the tuple, and these pointers are now out of sync with the tuple that the Slot really contains. So I think that it's essential that we explicitly update the scan result Slot during ExecRestrPos. It seems to be a good idea also to make the function return the Slot. As far as I can tell, nodeMergeJoin has been depending on the assumption that the physical address of the result slot doesn't change during Restore. Which is true for all the current plan types, but since the ExecProcNode API isn't designed to assume that a node always returns the same Slot, it doesn't seem like ExecRestrPos should either. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Best way to scan on-disk bitmaps
Bruce Momjian pgman@candle.pha.pa.us writes: Hmm. That particular case will work, but the planner believes that only consecutive columns in the index are usable --- that is, if you have quals for a and c but not for b, it will think that the condition for c isn't usable with the index. This is true for btree and gist indexes, so I suppose we'd need to introduce a pg_am column that tells what to do. We do have a TODO for this: * Use index to restrict rows returned by multi-key index when used with non-consecutive keys to reduce heap accesses For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3 = 9, spin though the index checking for col1 and col3 matches, rather than just col1; also called skip-scanning. That TODO is something else. Though it is related in that it is another example of why the existing code is too simplistic and will eventually need to be enhanced. Not only would bitmap indexes and (possibly) gist indexes, but even btree indexes would need to do so if this TODO were implemented. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Views, views, views! (long)
On Wed, 4 May 2005 21:37:40 -0700, Josh Berkus josh@agliodbs.com wrote: As stated above, these system views, once incorporated into a pg distribution, are likely to be with us *forever*. I don't think that this is doable. :-( You might want to put the system views into a version specific schema, say pg_views81. The next PG version will contain a new schema pg_views82 plus a version of 8.1 views that have been adapted to new features and catalog structures as far as possible without breaking compatibility. Ideally the views in pg_views81 and pg_views82 will look the same, but most likely there will be some differences. In PG 8.3 we will have schemas pg_views81, pg_views82, and pg_views83 ... Obviously it will get harder and harder to maintain older system view schemas with each new Postgres version. If in PG 8.7 it becomes clear that carrying on pg_views81 doesn't make sense any more, you simply drop it. By that time tool vendors should have had enough time to make their tools compatible with pg_views8x, for some x = 2. Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Best way to scan on-disk bitmaps
On Thu, 12 May 2005 17:40:06 -0400, Tom Lane [EMAIL PROTECTED] wrote: the planner believes that only consecutive columns in the index are usable --- that is, if you have quals for a and c but not for b, it will think that the condition for c isn't usable with the index. This is true for btree [...] It's not difficult to setup a test case where an index is used, but only with a=42 as an index condition, and c='foo' is applied as a filter condition. Now add a redundant condition on b ... AND b BETWEEN minb AND maxb ... and watch how c='foo' moves into the index condition. I did this test some time ago and I believe that adding the condition on b did not change the estimated cost, only the actual execution time. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Best way to scan on-disk bitmaps
Greg Stark [EMAIL PROTECTED] writes: Bruce Momjian pgman@candle.pha.pa.us writes: Hmm. That particular case will work, but the planner believes that only consecutive columns in the index are usable --- that is, if you have quals for a and c but not for b, it will think that the condition for c isn't usable with the index. We do have a TODO for this: * Use index to restrict rows returned by multi-key index when used with non-consecutive keys to reduce heap accesses That TODO is something else. No, I think Bruce is right --- it's essentially the same thing. It certainly would be a good idea to change btrees to work like that, if we are going to modify the planner to relax the restriction for other index types. I think it would be easy to change the planner and btree to handle this (where easy means I remember where all the skeletons are buried). But I don't know the gist code hardly at all. Can anyone offer an informed opinion on whether gist can handle this now, and if not what it would take to handle it? (hash and rtree are not at issue since they don't support multi-key indexes.) regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Best way to scan on-disk bitmaps
On Sun, 15 May 2005, Tom Lane wrote: Greg Stark [EMAIL PROTECTED] writes: Bruce Momjian pgman@candle.pha.pa.us writes: Hmm. That particular case will work, but the planner believes that only consecutive columns in the index are usable --- that is, if you have quals for a and c but not for b, it will think that the condition for c isn't usable with the index. We do have a TODO for this: * Use index to restrict rows returned by multi-key index when used with non-consecutive keys to reduce heap accesses That TODO is something else. No, I think Bruce is right --- it's essentially the same thing. It certainly would be a good idea to change btrees to work like that, if we are going to modify the planner to relax the restriction for other index types. I think it would be easy to change the planner and btree to handle this (where easy means I remember where all the skeletons are buried). But I don't know the gist code hardly at all. Can anyone offer an informed opinion on whether gist can handle this now, and if not what it would take to handle it? I think that handling this in GiST is depends solely on how users consistent function designed to handle NULLs in keys. Other words, it should works as soon as users consistent function know what to do with NULLs in internal keys. Teodor will comment multi-key GiST tomorrow. We used Paul Aoki paper Generalizing ''Search'' in Generalized Search Trees, (available from http://www.sai.msu.su/~megera/postgres/gist/papers/csd-97-950.pdf ) for implementation of multi-key GiST index support. It's true, that first key is used for splitting, but elements with duplicated first key could be shuffled to get better clustering on second key. (hash and rtree are not at issue since they don't support multi-key indexes.) regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Planned change of ExecRestrPos API
I wrote: Currently, since nothing is explicitly done to the result Slot of a plan node when we restore its position, you might think that the Slot still points at the tuple that was current just before the Restore. You'd be wrong though, at least for seqscan and indexscan plans (I haven't looked yet at the other node types that support mark/restore). Actually, on looking closer, only seqscans have this problem --- and ExecSeqRestrPos is really dead code anyway at the moment. So rather than go through a large exercise to change the mark/restore API, I've just added some comments about what the API actually guarantees, and tweaked ExecSeqRestrPos to clear the output slot instead of leaving it in a potentially inconsistent state. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [ADMIN] Permissions not removed when group dropped
[Moved to -hackers] On Sat, May 14, 2005 at 11:32:23AM -0400, Tom Lane wrote: So what we've got [for DROP USER] is: 1. Reject if any references to user from within other databases (implementation restriction). 2. Reject if user owns any databases or tablespaces (safety feature). 3. If CASCADE mode, drop any owned objects within the current database; if RESTRICT mode, reject if there are owned objects within the current database. 4. Auto-drop any remaining references (ACLs and group memberships). 5. Drop the user itself. Sounds reasonable. Additionally we need to think what should happen if the user is the grantor of some privilege. I think we should warn in RESTRICT mode, and in CASCADE, revoke the privilege from the grantee. The main problem I see with this is that if you do have a user you want to get rid of who owns objects in multiple databases, it's still mighty hard to do it. It'd be nice to have some kind of command that either drops or reassigns ownership of everything the user has in the current database. Then you could use that repeatedly until you'd reached a point where DROP USER would work. Hmm. We could implement something like DROP USER LOCALLY [CASCADE | RESTRICT], which would be a very misleading name for operations 2-4 above. Additionally, if the user doesn't have references in other databases, drop the user itself. (Note it's inconsistent.) For DROP TABLESPACE, I think we should also provide some sort of DROP LOCALLY, that drops all objects in the current database. (Suggestions for the exact spelling of the option are welcome.) -- Alvaro Herrera (alvherre[a]surnet.cl) La gente vulgar solo piensa en pasar el tiempo; el que tiene talento, en aprovecharlo ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Planned change of ExecRestrPos API
On Sun, 2005-05-15 at 15:09 -0400, Tom Lane wrote: I'm planning to change ExecRestrPos and the routines it calls so that an updated TupleTableSlot holding the restored-to tuple is explicitly returned. Currently, since nothing is explicitly done to the result Slot of a plan node when we restore its position, you might think that the Slot still points at the tuple that was current just before the Restore. You'd be wrong though, at least for seqscan and indexscan plans (I haven't looked yet at the other node types that support mark/restore). The reason is that the restore operation changes the contents of a HeapTupleData struct in the scan state (rs_ctup or xs_ctup) and all that the Slot really contains is a pointer to that struct. Now this is really bad. In the first place, the Slot thinks it has a pin on the buffer containing its current tuple. After a Restore, it may have pin on the wrong buffer. Sounds terrible. Is this a particular bug you're fixing? It seems to be sheer chance that we've not had bugs due to this. It isn't a very common case, thats why. You'd need to have value duplicates in both joined columns, which is effectively a product join. Granted it is syntactically allowable SQL. AFAICS all joins should be between relations 1:1 or 1:M. A direct M:M join is actually missing out the associative relation, or a non-key self join. Such a join would rarely if ever have any correct and real meaning. I can think of a few, but mostly its just incorrectly coded SQL, or use of special values (e.g. blanks) rather than NULLs. So my guess is that ExecRestrPos is almost never called. (The underlying scan does have pin on the right buffer, but one can easily imagine sequences in which the scan could be cleared while the Slot is still assumed valid.) As of CVS tip the consequences could be even worse, because the Slot may contain some pointers to extracted fields of the tuple, and these pointers are now out of sync with the tuple that the Slot really contains. So I think that it's essential that we explicitly update the scan result Slot during ExecRestrPos. Yeh, no problem. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] [ADMIN] Permissions not removed when group dropped
Alvaro Herrera [EMAIL PROTECTED] writes: Additionally we need to think what should happen if the user is the grantor of some privilege. I think we should warn in RESTRICT mode, and in CASCADE, revoke the privilege from the grantee. You mean fail in RESTRICT mode, no? Hmm. We could implement something like DROP USER LOCALLY [CASCADE | RESTRICT], which would be a very misleading name for operations 2-4 above. Additionally, if the user doesn't have references in other databases, drop the user itself. (Note it's inconsistent.) I'd go for something more like DROP OWNED OBJECTS, which'd be just the stuff internal to the current database (owned objects and ACL entries). You don't need to drop group memberships per-database. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] PostgreSQL running out of file handles
I suppose you are running on some BSD variant? BSD is notorious for promising more than it can deliver with respect to number of open files per process. This is a kernel bug, not a Postgres bug. You can adjust Postgres' max_files_per_process setting to compensate for the kernel's lying about its capabilities. (Postgres is in fact one of the most robust applications I know of in terms of not going belly-up in response to EMFILE or ENFILE. However, if there are not any spare descriptors to close, there's not a lot we can do except fail.) regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [ADMIN] Permissions not removed when group dropped
On Sun, May 15, 2005 at 05:48:56PM -0400, Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Additionally we need to think what should happen if the user is the grantor of some privilege. I think we should warn in RESTRICT mode, and in CASCADE, revoke the privilege from the grantee. You mean fail in RESTRICT mode, no? Yes, with a message indicating what happened. Hmm. We could implement something like DROP USER LOCALLY [CASCADE | RESTRICT], which would be a very misleading name for operations 2-4 above. Additionally, if the user doesn't have references in other databases, drop the user itself. (Note it's inconsistent.) I'd go for something more like DROP OWNED OBJECTS, which'd be just the stuff internal to the current database (owned objects and ACL entries). You don't need to drop group memberships per-database. Ok. -- Alvaro Herrera (alvherre[a]surnet.cl) Find a bug in a program, and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever (Oliver Silfridge) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] bitmap scans, btree scans, and tid order
About this time last year I was holding forth on the value of visiting the heap in TID order, even when index scan tuples are randomly ordered. Today I decided to start working on the problem stated in this TODO item: Fetch heap pages matching index entries in sequential order Rather than randomly accessing heap pages based on index entries, mark heap pages needing access in a bitmap and do the lookups in sequential order. Another method would be to sort heap ctids matching the index before accessing the heap rows. I see that Tom has already done the infrastructure work by adding getmulti, but getmulti isn't used by nodeIndexscan.c, only nodeBitmapIndexscan.c. Will btree index scans be executed by creating in-memory bitmaps in 8.1, or will some scans still be executed the usual way? If the former, I'd be wasting time, but in the latter case it would be worth it. -jwb ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [Fwd: Re: [HACKERS] SQL99 Hierarchical queries]
On Sun, May 15, 2005 at 04:44:57PM +0800, Christopher Kings-Lynne wrote: Looks like hierarchical queries are now officially stalled :( Anyone want to take this up for 8.1? Sergei and Jason, Feel like taking SQL:1999 WITH RECURSIVE? It would be a giant help to the PostgreSQL community. :) http://gppl.moonbone.ru/index.shtml has part of it, and http://candle.pha.pa.us/mhonarc/patches2/msg00175.html has others. There's also MERGE, which is covered starting on page 47 of http://wiscorp.com/sql/SQL2003Features.pdf also pp 839-845 of 5WD-02-Foundation-2003-09.pdf which is part of this: http://wiscorp.com/sql/sql_2003_standard.zip and an overview here: http://www.varlena.com/varlena/GeneralBits/73.php Cheers, D Hi, I haven't done any significant progress on that way because of lack of free time. Beside this, I'm recently changed my job and now I'm woking for MySQL. I think it's not possible for me to continue work on PostgreSQL. Feel free to take the patch and develop it further as long as you mention me as author of initial version. Regards, Evgen. On 5/5/05, Christopher Kings-Lynne [EMAIL PROTECTED] wrote: Hi Evgen, I just keep pinging this patch thread every once in a while to make sure it doesn't get forgotten :) How is the syncing with 8.1 CVS coming along? Chris Evgen Potemkin wrote: Hi hackers! I have done initial implementation of SQL99 WITH clause (attached). It's now only for v7.3.4 and haven't a lot of checks and restrictions. It can execute only simple WITH queries like WITH tree AS (SELECT id,pnt,name,1::int AS level FROM t WHERE id=0 UNION SELECT t.id,t.pnt,t.name,tree.level+1 FROM t JOIN tree ON tree.id=t.pnt ) SELECT * FROM tree; It would be great if someone with knowledge of Pg internals can make a kind of code review and make some advices how to better implement all this. Regards, Evgen. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings On Wed, May 11, 2005 at 07:01:50PM -0700, Sergey Ten wrote: Hello all, We would like to contribute to the Postgresql community by implementing the following items from the TODO list (http://developer.postgresql.org/todo.php): . Allow COPY to understand \x as a hex byte . Allow COPY to optionally include column headings in the first line . Add XML output to COPY The changes are straightforward and include implementation of the features as well as modification of the regression tests and documentation. Before sending a diff file with the changes, we would like to know if these features have been already implemented. Best regards, Jason Lucas and Sergey Ten SourceLabs Dependable Open Source Systems ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] bitmap scans, btree scans, and tid order
Jeffrey W. Baker [EMAIL PROTECTED] writes: I see that Tom has already done the infrastructure work by adding getmulti, but getmulti isn't used by nodeIndexscan.c, only nodeBitmapIndexscan.c. Will btree index scans be executed by creating in-memory bitmaps in 8.1, or will some scans still be executed the usual way? We aren't going to remove the existing indexscan behavior, because bitmap scans lose the ordering of the underlying index. There are many situations where that ordering is important. (See for instance the recent changes to make MAX/MIN use that behavior.) regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]