[Fwd: Re: [HACKERS] SQL99 Hierarchical queries]

2005-05-15 Thread Christopher Kings-Lynne
Looks like hierarchical queries are now officially stalled :(
Anyone want to take this up for 8.1?
Chris
 Original Message 
Subject: Re: [HACKERS] SQL99 Hierarchical queries
Date: Sun, 15 May 2005 07:31:16 +0400
From: Evgen Potemkin [EMAIL PROTECTED]
Reply-To: Evgen Potemkin [EMAIL PROTECTED]
To: Christopher Kings-Lynne [EMAIL PROTECTED]
References: [EMAIL PROTECTED]	 
[EMAIL PROTECTED]

Hi,
I haven't done any significant progress on that way because of lack of
free time.
Beside this, I'm recently changed my job and now I'm woking for MySQL.
I think it's not possible for me to continue work on PostgreSQL.
Feel free to take the patch and develop it further as long as you
mention me as author of initial version.
Regards, Evgen.
On 5/5/05, Christopher Kings-Lynne [EMAIL PROTECTED] wrote:
Hi Evgen,
I just keep pinging this patch thread every once in a while to make sure
it doesn't get forgotten :)
How is the syncing with 8.1 CVS coming along?
Chris
Evgen Potemkin wrote:
 Hi hackers!

 I have done initial implementation of SQL99 WITH clause (attached).
 It's now only for v7.3.4 and haven't a lot of checks and restrictions.
 It can execute only simple WITH queries like
 WITH tree AS (SELECT id,pnt,name,1::int AS level FROM t WHERE id=0
 UNION SELECT t.id,t.pnt,t.name,tree.level+1 FROM t JOIN tree ON
 tree.id=t.pnt ) SELECT * FROM tree;
 It would be great if someone with knowledge of Pg internals can make a
 kind of code review and make some advices how to better implement all
 this.

 Regards, Evgen.


 


 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] Planned change of ExecRestrPos API

2005-05-15 Thread Tom Lane
I'm planning to change ExecRestrPos and the routines it calls so that
an updated TupleTableSlot holding the restored-to tuple is explicitly
returned.

Currently, since nothing is explicitly done to the result Slot of a
plan node when we restore its position, you might think that the Slot
still points at the tuple that was current just before the Restore.
You'd be wrong though, at least for seqscan and indexscan plans
(I haven't looked yet at the other node types that support
mark/restore).  The reason is that the restore operation changes
the contents of a HeapTupleData struct in the scan state (rs_ctup
or xs_ctup) and all that the Slot really contains is a pointer to
that struct.

Now this is really bad.  In the first place, the Slot thinks it
has a pin on the buffer containing its current tuple.  After a
Restore, it may have pin on the wrong buffer.  It seems to be
sheer chance that we've not had bugs due to this.  (The underlying
scan does have pin on the right buffer, but one can easily imagine
sequences in which the scan could be cleared while the Slot is
still assumed valid.)  As of CVS tip the consequences could be
even worse, because the Slot may contain some pointers to extracted
fields of the tuple, and these pointers are now out of sync with
the tuple that the Slot really contains.

So I think that it's essential that we explicitly update the scan
result Slot during ExecRestrPos.

It seems to be a good idea also to make the function return the Slot.
As far as I can tell, nodeMergeJoin has been depending on the assumption
that the physical address of the result slot doesn't change during
Restore.  Which is true for all the current plan types, but since
the ExecProcNode API isn't designed to assume that a node always
returns the same Slot, it doesn't seem like ExecRestrPos should either.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-15 Thread Greg Stark
Bruce Momjian pgman@candle.pha.pa.us writes:

  Hmm.  That particular case will work, but the planner believes that only
  consecutive columns in the index are usable --- that is, if you have
  quals for a and c but not for b, it will think that the condition for c
  isn't usable with the index.  This is true for btree and gist indexes,
  so I suppose we'd need to introduce a pg_am column that tells what to
  do.
 
 We do have a TODO for this:
 
 * Use index to restrict rows returned by multi-key index when used with
   non-consecutive keys to reduce heap accesses
 
   For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
   col3 = 9, spin though the index checking for col1 and col3 matches,
   rather than just col1; also called skip-scanning.

That TODO is something else. 

Though it is related in that it is another example of why the existing code is
too simplistic and will eventually need to be enhanced. Not only would bitmap
indexes and (possibly) gist indexes, but even btree indexes would need to do
so if this TODO were implemented.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Views, views, views! (long)

2005-05-15 Thread Manfred Koizar
On Wed, 4 May 2005 21:37:40 -0700, Josh Berkus josh@agliodbs.com
wrote:
As stated above, these system views, once incorporated into a pg distribution, 
are likely to be with us *forever*.

I don't think that this is doable.  :-(

You might want to put the system views into a version specific schema,
say pg_views81.  The next PG version will contain a new schema
pg_views82 plus a version of 8.1 views that have been adapted to new
features and catalog structures as far as possible without breaking
compatibility.  Ideally the views in pg_views81 and pg_views82 will
look the same, but most likely there will be some differences.  In PG
8.3 we will have schemas pg_views81, pg_views82, and pg_views83 ...

Obviously it will get harder and harder to maintain older system view
schemas with each new Postgres version.  If in PG 8.7 it becomes clear
that carrying on pg_views81 doesn't make sense any more, you simply
drop it.  By that time tool vendors should have had enough time to
make their tools compatible with pg_views8x, for some x = 2.

Servus
 Manfred


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-15 Thread Manfred Koizar
On Thu, 12 May 2005 17:40:06 -0400, Tom Lane [EMAIL PROTECTED]
wrote:
the planner believes that only
consecutive columns in the index are usable --- that is, if you have
quals for a and c but not for b, it will think that the condition for c
isn't usable with the index.  This is true for btree [...]

It's not difficult to setup a test case where an index is used, but
only with a=42 as an index condition, and c='foo' is applied as a
filter condition.  Now add a redundant condition on b
... AND b BETWEEN minb AND maxb ...
and watch how c='foo' moves into the index condition.

I did this test some time ago and I believe that adding the condition
on b did not change the estimated cost, only the actual execution
time.

Servus
 Manfred


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-15 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Bruce Momjian pgman@candle.pha.pa.us writes:
 Hmm.  That particular case will work, but the planner believes that only
 consecutive columns in the index are usable --- that is, if you have
 quals for a and c but not for b, it will think that the condition for c
 isn't usable with the index.

 We do have a TODO for this:
 
 * Use index to restrict rows returned by multi-key index when used with
 non-consecutive keys to reduce heap accesses

 That TODO is something else. 

No, I think Bruce is right --- it's essentially the same thing.  It
certainly would be a good idea to change btrees to work like that,
if we are going to modify the planner to relax the restriction for
other index types.

I think it would be easy to change the planner and btree to handle
this (where easy means I remember where all the skeletons are
buried).  But I don't know the gist code hardly at all.  Can anyone
offer an informed opinion on whether gist can handle this now, and
if not what it would take to handle it?

(hash and rtree are not at issue since they don't support multi-key
indexes.)

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Best way to scan on-disk bitmaps

2005-05-15 Thread Oleg Bartunov
On Sun, 15 May 2005, Tom Lane wrote:
Greg Stark [EMAIL PROTECTED] writes:
Bruce Momjian pgman@candle.pha.pa.us writes:
Hmm.  That particular case will work, but the planner believes that only
consecutive columns in the index are usable --- that is, if you have
quals for a and c but not for b, it will think that the condition for c
isn't usable with the index.
We do have a TODO for this:
* Use index to restrict rows returned by multi-key index when used with
non-consecutive keys to reduce heap accesses

That TODO is something else.
No, I think Bruce is right --- it's essentially the same thing.  It
certainly would be a good idea to change btrees to work like that,
if we are going to modify the planner to relax the restriction for
other index types.
I think it would be easy to change the planner and btree to handle
this (where easy means I remember where all the skeletons are
buried).  But I don't know the gist code hardly at all.  Can anyone
offer an informed opinion on whether gist can handle this now, and
if not what it would take to handle it?
I think that handling this in GiST is depends solely on how users consistent 
function designed to handle NULLs in keys. Other words, it should works as 
soon as users consistent function know what to do with NULLs in internal keys.

Teodor will comment multi-key GiST tomorrow.
We used Paul Aoki paper Generalizing ''Search'' in Generalized Search Trees,
(available from 
http://www.sai.msu.su/~megera/postgres/gist/papers/csd-97-950.pdf )
for implementation of multi-key GiST index support. It's true, that first
key is used for splitting, but elements with duplicated first key could
be shuffled to get better clustering on second key.
(hash and rtree are not at issue since they don't support multi-key
indexes.)
regards, tom lane
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Planned change of ExecRestrPos API

2005-05-15 Thread Tom Lane
I wrote:
 Currently, since nothing is explicitly done to the result Slot of a
 plan node when we restore its position, you might think that the Slot
 still points at the tuple that was current just before the Restore.
 You'd be wrong though, at least for seqscan and indexscan plans
 (I haven't looked yet at the other node types that support
 mark/restore).

Actually, on looking closer, only seqscans have this problem --- and
ExecSeqRestrPos is really dead code anyway at the moment.  So rather
than go through a large exercise to change the mark/restore API, I've
just added some comments about what the API actually guarantees, and
tweaked ExecSeqRestrPos to clear the output slot instead of leaving it
in a potentially inconsistent state.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [ADMIN] Permissions not removed when group dropped

2005-05-15 Thread Alvaro Herrera
[Moved to -hackers]

On Sat, May 14, 2005 at 11:32:23AM -0400, Tom Lane wrote:

 So what we've got [for DROP USER] is:
 
 1. Reject if any references to user from within other databases
 (implementation restriction).
 
 2. Reject if user owns any databases or tablespaces (safety feature).
 
 3. If CASCADE mode, drop any owned objects within the current database;
 if RESTRICT mode, reject if there are owned objects within the current
 database.
 
 4. Auto-drop any remaining references (ACLs and group memberships).
 
 5. Drop the user itself.

Sounds reasonable.

Additionally we need to think what should happen if the user is the
grantor of some privilege.  I think we should warn in RESTRICT mode, and
in CASCADE, revoke the privilege from the grantee.


 The main problem I see with this is that if you do have a user you want
 to get rid of who owns objects in multiple databases, it's still mighty
 hard to do it.  It'd be nice to have some kind of command that either
 drops or reassigns ownership of everything the user has in the current
 database.  Then you could use that repeatedly until you'd reached a
 point where DROP USER would work.

Hmm.  We could implement something like DROP USER LOCALLY [CASCADE |
RESTRICT], which would be a very misleading name for operations 2-4
above.  Additionally, if the user doesn't have references in other
databases, drop the user itself.  (Note it's inconsistent.)

For DROP TABLESPACE, I think we should also provide some sort of DROP
LOCALLY, that drops all objects in the current database.  (Suggestions
for the exact spelling of the option are welcome.)

-- 
Alvaro Herrera (alvherre[a]surnet.cl)
La gente vulgar solo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Planned change of ExecRestrPos API

2005-05-15 Thread Simon Riggs
On Sun, 2005-05-15 at 15:09 -0400, Tom Lane wrote:
 I'm planning to change ExecRestrPos and the routines it calls so that
 an updated TupleTableSlot holding the restored-to tuple is explicitly
 returned.
 
 Currently, since nothing is explicitly done to the result Slot of a
 plan node when we restore its position, you might think that the Slot
 still points at the tuple that was current just before the Restore.
 You'd be wrong though, at least for seqscan and indexscan plans
 (I haven't looked yet at the other node types that support
 mark/restore).  The reason is that the restore operation changes
 the contents of a HeapTupleData struct in the scan state (rs_ctup
 or xs_ctup) and all that the Slot really contains is a pointer to
 that struct.
 
 Now this is really bad.  In the first place, the Slot thinks it
 has a pin on the buffer containing its current tuple.  After a
 Restore, it may have pin on the wrong buffer.  

Sounds terrible. Is this a particular bug you're fixing?

 It seems to be
 sheer chance that we've not had bugs due to this.  

It isn't a very common case, thats why. You'd need to have value
duplicates in both joined columns, which is effectively a product join.
Granted it is syntactically allowable SQL.

AFAICS all joins should be between relations 1:1 or 1:M. A direct M:M
join is actually missing out the associative relation, or a non-key self
join. Such a join would rarely if ever have any correct and real
meaning. I can think of a few, but mostly its just incorrectly coded
SQL, or use of special values (e.g. blanks) rather than NULLs.

So my guess is that ExecRestrPos is almost never called.

 (The underlying
 scan does have pin on the right buffer, but one can easily imagine
 sequences in which the scan could be cleared while the Slot is
 still assumed valid.)  As of CVS tip the consequences could be
 even worse, because the Slot may contain some pointers to extracted
 fields of the tuple, and these pointers are now out of sync with
 the tuple that the Slot really contains.
 
 So I think that it's essential that we explicitly update the scan
 result Slot during ExecRestrPos.

Yeh, no problem.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] [ADMIN] Permissions not removed when group dropped

2005-05-15 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Additionally we need to think what should happen if the user is the
 grantor of some privilege.  I think we should warn in RESTRICT mode, and
 in CASCADE, revoke the privilege from the grantee.

You mean fail in RESTRICT mode, no?

 Hmm.  We could implement something like DROP USER LOCALLY [CASCADE |
 RESTRICT], which would be a very misleading name for operations 2-4
 above.  Additionally, if the user doesn't have references in other
 databases, drop the user itself.  (Note it's inconsistent.)

I'd go for something more like DROP OWNED OBJECTS, which'd be just
the stuff internal to the current database (owned objects and ACL
entries).  You don't need to drop group memberships per-database.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] PostgreSQL running out of file handles

2005-05-15 Thread Christopher Kings-Lynne
I suppose you are running on some BSD variant?  BSD is notorious for
promising more than it can deliver with respect to number of open files
per process.  This is a kernel bug, not a Postgres bug.
You can adjust Postgres' max_files_per_process setting to compensate for
the kernel's lying about its capabilities.
(Postgres is in fact one of the most robust applications I know of
in terms of not going belly-up in response to EMFILE or ENFILE.
However, if there are not any spare descriptors to close, there's
not a lot we can do except fail.)
			regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [ADMIN] Permissions not removed when group dropped

2005-05-15 Thread Alvaro Herrera
On Sun, May 15, 2005 at 05:48:56PM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Additionally we need to think what should happen if the user is the
  grantor of some privilege.  I think we should warn in RESTRICT mode, and
  in CASCADE, revoke the privilege from the grantee.
 
 You mean fail in RESTRICT mode, no?

Yes, with a message indicating what happened.

  Hmm.  We could implement something like DROP USER LOCALLY [CASCADE |
  RESTRICT], which would be a very misleading name for operations 2-4
  above.  Additionally, if the user doesn't have references in other
  databases, drop the user itself.  (Note it's inconsistent.)
 
 I'd go for something more like DROP OWNED OBJECTS, which'd be just
 the stuff internal to the current database (owned objects and ACL
 entries).  You don't need to drop group memberships per-database.

Ok.

-- 
Alvaro Herrera (alvherre[a]surnet.cl)
Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever (Oliver Silfridge)

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] bitmap scans, btree scans, and tid order

2005-05-15 Thread Jeffrey W. Baker
About this time last year I was holding forth on the value of visiting
the heap in TID order, even when index scan tuples are randomly ordered.
Today I decided to start working on the problem stated in this TODO
item:

Fetch heap pages matching index entries in sequential order 

Rather than randomly accessing heap pages based on index
entries, mark heap pages needing access in a bitmap and do the
lookups in sequential order. Another method would be to sort
heap ctids matching the index before accessing the heap rows.

I see that Tom has already done the infrastructure work by adding
getmulti, but getmulti isn't used by nodeIndexscan.c, only
nodeBitmapIndexscan.c.  Will btree index scans be executed by creating
in-memory bitmaps in 8.1, or will some scans still be executed the usual
way?  If the former, I'd be wasting time, but in the latter case it
would be worth it.

-jwb

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [Fwd: Re: [HACKERS] SQL99 Hierarchical queries]

2005-05-15 Thread David Fetter
On Sun, May 15, 2005 at 04:44:57PM +0800, Christopher Kings-Lynne wrote:
 Looks like hierarchical queries are now officially stalled :(
 
 Anyone want to take this up for 8.1?

Sergei and Jason,

Feel like taking SQL:1999 WITH RECURSIVE?  It would be a giant help to
the PostgreSQL community. :)

http://gppl.moonbone.ru/index.shtml

has part of it, and

http://candle.pha.pa.us/mhonarc/patches2/msg00175.html

has others.

There's also MERGE, which is covered starting on page 47 of 
http://wiscorp.com/sql/SQL2003Features.pdf

also pp 839-845 of 5WD-02-Foundation-2003-09.pdf which is part of
this:
http://wiscorp.com/sql/sql_2003_standard.zip

and an overview here:
http://www.varlena.com/varlena/GeneralBits/73.php

Cheers,
D
 Hi,
 
 I haven't done any significant progress on that way because of lack of
 free time.
 Beside this, I'm recently changed my job and now I'm woking for MySQL.
 I think it's not possible for me to continue work on PostgreSQL.
 Feel free to take the patch and develop it further as long as you
 mention me as author of initial version.
 
 Regards, Evgen.
 
 On 5/5/05, Christopher Kings-Lynne [EMAIL PROTECTED] wrote:
 Hi Evgen,
 
 I just keep pinging this patch thread every once in a while to make sure
 it doesn't get forgotten :)
 
 How is the syncing with 8.1 CVS coming along?
 
 Chris
 
 Evgen Potemkin wrote:
  Hi hackers!
 
  I have done initial implementation of SQL99 WITH clause (attached).
  It's now only for v7.3.4 and haven't a lot of checks and restrictions.
  It can execute only simple WITH queries like
  WITH tree AS (SELECT id,pnt,name,1::int AS level FROM t WHERE id=0
  UNION SELECT t.id,t.pnt,t.name,tree.level+1 FROM t JOIN tree ON
  tree.id=t.pnt ) SELECT * FROM tree;
  It would be great if someone with knowledge of Pg internals can make a
  kind of code review and make some advices how to better implement all
  this.
 
  Regards, Evgen.
 
 
  
 
 
  ---(end of broadcast)---
  TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
 
 
 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings

On Wed, May 11, 2005 at 07:01:50PM -0700, Sergey Ten wrote:
 Hello all,
 
 We would like to contribute to the Postgresql community by implementing 
 the following items from the TODO list 
 (http://developer.postgresql.org/todo.php):
 . Allow COPY to understand \x as a hex byte . Allow COPY to optionally 
 include column headings in the first line . Add XML output to COPY
 
 The changes are straightforward and include implementation of the 
 features as well as modification of the regression tests and documentation.
 
 Before sending a diff file with the changes, we would like to know if 
 these features have been already implemented.
 
 Best regards,
 Jason Lucas and Sergey Ten
 SourceLabs
 
 Dependable Open Source Systems
 
 
 ---(end of broadcast)---
 TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] bitmap scans, btree scans, and tid order

2005-05-15 Thread Tom Lane
Jeffrey W. Baker [EMAIL PROTECTED] writes:
 I see that Tom has already done the infrastructure work by adding
 getmulti, but getmulti isn't used by nodeIndexscan.c, only
 nodeBitmapIndexscan.c.  Will btree index scans be executed by creating
 in-memory bitmaps in 8.1, or will some scans still be executed the usual
 way?

We aren't going to remove the existing indexscan behavior, because
bitmap scans lose the ordering of the underlying index.  There are many
situations where that ordering is important.  (See for instance the
recent changes to make MAX/MIN use that behavior.)

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]