[HACKERS] smgr.c and smgrtype.c

2004-10-19 Thread Satoshi Nagayasu
Hello all,

I'm trying to modify the storage manager now.

I found a number of smgrs are stored separately in the storage manager
module (NSmgr in smgr.c and NStorageManagers in smgrtype.c),
and names of storage managers are stored in smgrtype.c.
Are there any reason for this?

I think they shold be stored in smgr.c and and interfaces
to get a name of smgr or a number of smgrs should be provided.

And the smgrid struct (in smgrtype.c) is (still) used?

My patch is available here:
http://snaga.org/pgsql/patches/smgr.patch

Any comments?

-- 
NAGAYASU Satoshi [EMAIL PROTECTED]
OpenSource Development Center,
NTT DATA Corp. http://www.nttdata.co.jp/


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] 7.4 changes

2004-10-19 Thread Neil Conway
On Tue, 2004-10-19 at 02:45, Andrew Dunstan wrote:
 *shrug* OK. Then plperl should probably not be regarded as being as 
 trusted as we would like. Note that old versions of Safe.pm  have been 
 the subject of security advisories such as this one 
 http://www.securityfocus.com/bid/6111/info/ for some time.

Perhaps a compromise would be to require the newer version of Safe.pm,
but leave the other changes for 8.0. Upgrading Safe.pm can presumably be
done without needing any changes to the rest of one's pl/perl code.

-Neil



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] smgr.c and smgrtype.c

2004-10-19 Thread Tom Lane
Satoshi Nagayasu [EMAIL PROTECTED] writes:
 I'm trying to modify the storage manager now.

Um ... why?

There is no doubt that the current smgr interface leaves a lot to be
desired, but the reason that it's in such sad shape is that there is
absolutely no modern-day use for an API at that particular level of
abstraction.  The stuff that the Berkeley boys and girls envisioned
doing here has all migrated down into the kernel, if not clear down
into the hardware (think RAID controller).  Most of the stuff that
people would now like to have an API separation for is at much higher
levels of abstraction.  For example, the smgr API doesn't even know what
a tuple or an index *is*, much less have the potential to modify lookup
or locking or replication semantics.

If anyone had wanted to add a new storage manager in the last fifteen
years, we'd doubtless have tried to clean this up some, but no one has
and I'm not really expecting anyone to try in the next fifteen...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] smgr.c and smgrtype.c

2004-10-19 Thread Satoshi Nagayasu

I'm trying to modify the storage manager now.
 
 Um ... why?

Because I want to add my new storage manager.
It is not just for (single) magnetic disk.

 If anyone had wanted to add a new storage manager in the last fifteen
 years, we'd doubtless have tried to clean this up some, but no one has
 and I'm not really expecting anyone to try in the next fifteen...

I guess clean separation and APIs are necessary for now.
If one need to modify/extend a storage manager for better performance,
clean design and APIs are essentials.

For my purpose, I'm thinking about a pluggable storage manager
mechanism using dynamic loading, because I want to develop
my smgr independently from main codebase...

-- 
NAGAYASU Satoshi [EMAIL PROTECTED]
OpenSource Development Center,
NTT DATA Corp. http://www.nttdata.co.jp/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Time off

2004-10-19 Thread Christopher Kings-Lynne
To stop everyone asking me - I will still be working on phpPgAdmin, no 
need to panic :)

Next release of phpPgAdmin should be at the same time as 8.0 PostgreSQL.
Chris
Christopher Kings-Lynne wrote:
Hi everyone,
I think I'll be taking some time off from the PostgreSQL project, to 
work on other stuff that has my interest more at the moment :)

I'll still be lurking around, but I won't really have much time to do 
actual coding.

Cheers,
Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] Managing Directories

2004-10-19 Thread Tham Paudel

Hallo!
I would like to know if there are any  discussions
about managing directories from Postgresql Datnbank
System. Or knows anybody how to manage it? 



Thanks 






___
Gesendet von Yahoo! Mail - Jetzt mit 100MB Speicher kostenlos - Hier anmelden: 
http://mail.yahoo.de

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[HACKERS]

2004-10-19 Thread Ben Osborne



Postgres 7.0.2 
Problem
---

I am having a rather 
big problem with an installation of postgres 7.0.2 on cobalt, in that the db 
server is unable to see any of the data stored in the (only) database which is 
running (other than template1). I suspect that the files in the data 
directory have been conrruptedor otherwise lost integrity, possibly due to 
the fact that the disk partition where thedb cluster is running has 
reached 100% usage.

The problem is the 
-l doesn't list my database in the catalog, although i can psqlx into 
the database. Further, \d lists 'No Relations', however i can select * 
from a table, and although the database reportsno rows, thetable 
scheme does report correctly.

In terms of backup, 
all I have is a copy of the data directory itself. Unfortunately 
Iwas stupid enough not to have ever 
usedpg_dumpall.

My Question is, is 
there ANY means by which I can get at the data. I understand that in later 
versions there is a pg_resetxlog tool which can help with problems like this, 
but unfortunately not in 7.0.2. Does anybody know if there is an 
equivalent tool available for my version of the database? Alternatively, 
is there any other facility available which can extract the data from the data 
files in a form that I could work with?

I have laready tried 
re initialising the db on a larger disk partion and moving the data files 
around, but to no avail.

Any help would be 
very welcome!!

Regards.



Re: [HACKERS] Time off

2004-10-19 Thread Marc G. Fournier
Enjoy the break :)  Hints as to the 'other stuff' that is more intersting 
then PostgreSQL? :)  Or is it secret ... ?

On Tue, 19 Oct 2004, Christopher Kings-Lynne wrote:
To stop everyone asking me - I will still be working on phpPgAdmin, no need 
to panic :)

Next release of phpPgAdmin should be at the same time as 8.0 PostgreSQL.
Chris
Christopher Kings-Lynne wrote:
Hi everyone,
I think I'll be taking some time off from the PostgreSQL project, to work 
on other stuff that has my interest more at the moment :)

I'll still be lurking around, but I won't really have much time to do 
actual coding.

Cheers,
Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
---(end of broadcast)---
TIP 8: explain analyze is your friend

Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] 7.4 changes

2004-10-19 Thread Andrew Dunstan

Neil Conway wrote:
On Tue, 2004-10-19 at 02:45, Andrew Dunstan wrote:
 

*shrug* OK. Then plperl should probably not be regarded as being as 
trusted as we would like. Note that old versions of Safe.pm  have been 
the subject of security advisories such as this one 
http://www.securityfocus.com/bid/6111/info/ for some time.
   

Perhaps a compromise would be to require the newer version of Safe.pm,
but leave the other changes for 8.0. Upgrading Safe.pm can presumably be
done without needing any changes to the rest of one's pl/perl code.
 

s/the rest of/any of/
Indeed it can.
The other thing I suggested was removing the :base_io set of ops - I 
would regard plperl functions that did things like printing to STDOUT as 
broken to start with.

But maybe we can just live with what we have and advertise that 8.0's 
plperl is more secure.

cheers
andrew
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] 7.4 changes

2004-10-19 Thread Alvaro Herrera
On Tue, Oct 19, 2004 at 08:47:20AM -0400, Andrew Dunstan wrote:

 But maybe we can just live with what we have and advertise that 8.0's 
 plperl is more secure.

The release notes should point out that 7.4's plperl is unsecure unless
the correct version of Safe.pm is installed.  Maybe it works to make it
croak if an unsafe version of Safe.pm is found?

I'm not sure about living with known security vulnerabilities.  What
about ISPs which give Pg hosting with plperl installed?  They surely
will want to know about this.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
One man's impedance mismatch is another man's layer of abstraction.
(Lincoln Yeoh)


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] Command-line parsing in pg_ctl is not portable

2004-10-19 Thread Peter Eisentraut
The command-line argument parsing in pg_ctl is not portable.  This is the 
output on a glibc system:

$ pg_ctl start stop
pg_ctl: too many command-line arguments (first is start)

But:

$ POSIXLY_CORRECT=1 pg_ctl start stop
pg_ctl: too many command-line arguments (first is stop)

This is probably because GNU getopt rearranges the arguments, and since pg_ctl 
uses two while loops to try to allow non-option arguments before options, 
things may get reordered multiple times.

Now this particular case is minor trouble, but I wonder in what other 
situations arguments will get reordered where the order does make a 
difference.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Open Items

2004-10-19 Thread Zeugswetter Andreas DAZ SD

 o fix shared memory on Win2k terminal server
  
 We might be able to just mark this as not supported.
 
 I have attached a patch that I think fixes this. The problem I saw 
 and fixed is, that the shmem created in a terminal services client is not 
 visible to the console (or services.msc).
 
 Does this actually fix the problem for you?
 Because, as I have previously posted I think, it does *not* solve the
 problem on any of my test machines. I still get the shmget() error
 message when running from a TS session.

I think you are having another problem. 
I can create it here (with or without the patch). I am running 
W2000 5.00.2195 SP4. Maybe you are having a permissions problem? 
I am using a user with near Administrator privs.

 Also, I don't really see how the visibility of the shmem segment
 matters.

If it really does not matter, please don't apply my patch Bruce.
(still do the rename though please)

 We can't *create* the first instance of it, which should not
 affect this at all. And if we passed that, all backends are still
 execute in the same session, so there is no effect on it.

Yes it only matters if postmaster is started/trying to start from different 
TS Sessions. I think we need to determine global existance of the shm segment 
to get rid of old processes/segments.

 services.msc only interacts with the SCM, it has nothing at all to do
 with shmem.

I meant if run as a service, which is the same TS session as the console.

Andreas

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Hypothetical Indexes

2004-10-19 Thread Thomas Swan
On 10/12/2004 4:02 PM Tom Lane could be overheard saying::
Marcos A Vaz Salles [EMAIL PROTECTED] writes:
 

In order to make index selection possible, we
have extended the PostgreSQL DBMS to allow the simulation of
hypothetical indexes. We believe these server extensions may be of
value for addition to the PostgreSQL code base.
   

This would be of some value if the optimizer's cost estimates were
highly reliable, but unfortunately they are far from being so :-(
Without the ability to measure *actual* as opposed to estimated
costs, I'm not sure you can really do much.
 

Is it possible for the backend to store performance data and try to 
modify its cost estimates?  I was thinking of the statistics analyzer 
currently in use and whether (query) performance data could piggy back 
on it or if it would need a different process, something that could try 
and correlate cost estimates with actual costs.  Given sample data the 
calculation could result in an actual execution time estimation.   Would 
it be worth pursuing, or would it be too invasive?

I know the autotune project has similar goals from a different angle, 
system/memory settings.

Thomas
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] gettext calls in pgport

2004-10-19 Thread Peter Eisentraut
Am Montag, 18. Oktober 2004 19:43 schrieb Tom Lane:
 An alternative possibility is to stop pretending that pgport is agnostic
 about whether it is in backend or frontend.  This might mean some
 duplication of code between src/port/ and src/backend/port/, but if
 that's what it takes to have sane error handling, that's what we should do.

The original plan for libpgport was to be a repository of functions that 
replace missing operating system functionality, like libiberty.  I would have 
have liked to be able to lift these functions into other projects without 
complications.  That implies that these functions should certainly not care 
about anything that by definition goes on above the operating system level.

Now the directory has grown into a sort of general repository of code that is 
shared between more than one part of the PostgreSQL source tree, without any 
regard for well-defined interfaces.  If you need to do that, please put it 
elsewhere, where only the involved parts see it.  Not now, but in the future.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS]

2004-10-19 Thread Andrew Sullivan
On Tue, Oct 19, 2004 at 01:28:52PM +0100, Ben Osborne wrote:
 Postgres 7.0.2 Problem
 ---

Yikes.  That's old. 

 (only) database which is running (other than template1).  I suspect that the
 files in the data directory have been conrrupted or otherwise lost
 integrity, possibly due to the fact that the disk partition where the db
 cluster is running has reached 100% usage.
  
 The problem is the -l doesn't list my database in the catalog, although i
 can psql x into the database.  Further, \d lists 'No Relations', however
 i can select * from a table, and although the database reports no rows, the
 table scheme does report correctly. 

I doubt that's your problem.  I suspect xid wraparound instead. 
Unfortunately, without a pg_dump, I suspect your data is
inaccessible  (see the current docs, section 21.1.3, for an
explanation of why this is.  I think the 7.0 docs don't contain all
that info, BTW).  We have a very dangerous tool we've used for testing
that will thump the xid in 7.2, but I have no idea whether that'd
work in versions prior to that.  Jan Wieck might know, though.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the marketplace.
--Philip Greenspun

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] spinlocks: generalizing non-locking test

2004-10-19 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Granted, but I think you've mostly conceded my point: every _subsequent_
 time TAS() is invoked, the non-locking test is a clear win (with the
 possible exception of PPC).

I'm not real sure.  One point here is that the standard advice about
this stuff is generally thinking in terms of an *extremely* tight spin
loop, ie

while (TAS(lock))
;

The loop in s_lock.c has a bit more overhead than that.  Also, because
we only use spinlocks to protect LWLocks, the expected hold time for a
spinlock is just a couple dozen instructions, which is probably less
than the expected time in most other uses of spinlocks.  So I think it's
less than clear that we should expect TAS to fail, even within the loop.

Basically I'd like to see some tests proving that there's actually any
value in it before we go complicating the assembly-code API ...

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Time off

2004-10-19 Thread Andreas Pflug
Marc G. Fournier wrote:
Enjoy the break :)  Hints as to the 'other stuff' that is more 
intersting then PostgreSQL? :)  Or is it secret ... ?
It's probably just a joke. Can you imagine something more interesting 
than PostgreSQL?!?

Regards,
Andreas
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] embedded postgresql

2004-10-19 Thread Steve Atkins
On Thu, Oct 14, 2004 at 09:49:47AM +0200, [EMAIL PROTECTED] wrote:
 Dear Sirs,
 
 I would like to know if there are any discussions about
 creating an embedded version on postgresql. My thoughts
 go towards building/porting a sqlite equivalent of pg.

The discussion comes up occasionally. After some well-reasoned
analysis the conclusion is generally that an embedded postgresql would
keep either none of the advantages of postgresql or would lose most of
the (single-user specific) advantages of an embedded database. Or
both.

In other words, postgresql is a really bad place to start if you want
to make an embedded database. If you search the list archives you
should find the last time this was discussed.

Cheers,
  Steve

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS]

2004-10-19 Thread Tom Lane
Ben Osborne [EMAIL PROTECTED] writes:
 Postgres 7.0.2 Problem
 
 I am having a rather big problem with an installation of postgres 7.0.2 on
 cobalt, in that the db server is unable to see any of the data stored in the
 (only) database which is running (other than template1).

The symptoms seem reasonably consistent with the theory that you have
suffered transaction ID wraparound.  How large is the $PGDATA/pg_log
file?  If it's exactly 1Gb then this is almost certainly the answer.

 My Question is, is there ANY means by which I can get at the data.

I believe it is possible to reset the transaction counter to something a
little bit less than 4 billion, which will make everything up to that
point appear to be in the past again.  I have long since forgotten the
details, but digging in the list archives should turn up some discussion
of how to do that in 7.0.  Then do a quick pg_dumpall, initdb and
reload.

You should seriously consider updating to a more modern PG version
while you are at it ...

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Using ALTER TABLESPACE in pg_dump

2004-10-19 Thread Tom Lane
Philip Warner [EMAIL PROTECTED] writes:
 To solve this, we should dump the table definition as a format string and 
 dump the tablespace clause (ie. ' TABLESPACE ts') as a separate part of the 
 table definition TOC entry. If the user wants the tablespace to be dumped, 
 then we substitute the tablespace clause, otherwise a blank string. This 
 could be a useful general approach in the future.

I think the tricky part of that would be inserting the tablespace clause
in the right place; for CREATE INDEX this seems to require nontrivial
parsing.  (Both the index column definitions and the WHERE clause could
be arbitrarily complicated expressions.)  If we can get around that part
then this wouldn't be too hard.

 Also, I like the option of a soft-tablespace option, but also liked the 
 idea of the fake/logical/virtual tablespaces someone suggested earlier; if 
 restoring into a database without a required tablespace, then create a 
 virtual tablespace that points to pg_default.

Given that tablespaces are fundamentally only directories, there isn't
any particularly strong reason to not just make a real tablespace.  You
aren't going to constrain space allocation or anything by having another
directory in/alongside $PGDATA.  So I think the virtual tablespace
idea is basically pointless.

The real crux of all this, I think, is what if I want to restore as
a non-superuser, and so I don't have privilege to create tablespaces
to match what the dump wants?  The soft-failure option provides an
answer here, but creating either real or virtual tablespaces wouldn't
fly.  A --notablespace option in pg_restore would solve it too, but
only if you'd done an -Fc or -Ft dump; with a plain text dump you
still got trouble.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Using ALTER TABLESPACE in pg_dump

2004-10-19 Thread Philip Warner
At 03:06 AM 20/10/2004, Tom Lane wrote:
I think the tricky part of that would be inserting the tablespace clause
in the right place; for CREATE INDEX this seems to require nontrivial
parsing.  (Both the index column definitions and the WHERE clause could
be arbitrarily complicated expressions.)  If we can get around that part
then this wouldn't be too hard.
I may be missing something here; I was assuming that pg_dump would dump 
would build the CREATE INDEX/TABLE/etc commands with the %%tablespace%% 
already embedded. pg_restore would not need to do any parsing. Or is there 
something I don't understand?




Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 03 5330 3172  | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp.mit.edu:11371   |/ 

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Using ALTER TABLESPACE in pg_dump

2004-10-19 Thread Tom Lane
Philip Warner [EMAIL PROTECTED] writes:
 At 03:06 AM 20/10/2004, Tom Lane wrote:
 I think the tricky part of that would be inserting the tablespace clause
 in the right place; for CREATE INDEX this seems to require nontrivial
 parsing.  (Both the index column definitions and the WHERE clause could
 be arbitrarily complicated expressions.)  If we can get around that part
 then this wouldn't be too hard.

 I may be missing something here; I was assuming that pg_dump would dump 
 would build the CREATE INDEX/TABLE/etc commands with the %%tablespace%% 
 already embedded. pg_restore would not need to do any parsing. Or is there 
 something I don't understand?

Maybe there's something I don't understand.  How are you expecting
pg_restore to control whether it outputs the command with a TABLESPACE
clause embedded or not, if pg_dump has already built the command string
that way?  I thought you were envisioning that pg_restore would insert,
or not insert, a TABLESPACE clause into a command that didn't initially
have one.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Time off

2004-10-19 Thread Jan Wieck
On 10/19/2004 12:11 PM, Andreas Pflug wrote:
Marc G. Fournier wrote:
Enjoy the break :)  Hints as to the 'other stuff' that is more 
intersting then PostgreSQL? :)  Or is it secret ... ?
It's probably just a joke. Can you imagine something more interesting 
than PostgreSQL?!?
There comes the time in every hackers life when he discovers that even 
unsuccessfully chasing girls can be more fun than debugging kernel 
modules or interface libraries. Some get over that phase without greater 
collateral damage, some become successfull in the chasing, some then get 
caught by the upgrade policies of this quite different kind of hard- and 
software, and some even go that far that they experiment with its 
replication features ... and believe me, it takes a lot of time to get 
those replicas running :-)

Jan

Regards,
Andreas
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS]

2004-10-19 Thread Jan Wieck
On 10/19/2004 11:41 AM, Andrew Sullivan wrote:
On Tue, Oct 19, 2004 at 01:28:52PM +0100, Ben Osborne wrote:
Postgres 7.0.2 Problem
---
Yikes.  That's old. 

(only) database which is running (other than template1).  I suspect that the
files in the data directory have been conrrupted or otherwise lost
integrity, possibly due to the fact that the disk partition where the db
cluster is running has reached 100% usage.
 
The problem is the -l doesn't list my database in the catalog, although i
can psql x into the database.  Further, \d lists 'No Relations', however
i can select * from a table, and although the database reports no rows, the
table scheme does report correctly. 
I doubt that's your problem.  I suspect xid wraparound instead. 
Unfortunately, without a pg_dump, I suspect your data is
inaccessible  (see the current docs, section 21.1.3, for an
explanation of why this is.  I think the 7.0 docs don't contain all
that info, BTW).  We have a very dangerous tool we've used for testing
that will thump the xid in 7.2, but I have no idea whether that'd
work in versions prior to that.  Jan Wieck might know, though.

A
It's a user defined function in C that modifies this xid counter on 
call. But I don't think I would recommend mucking with DDL statements in 
a wrapped around DB ... that's not going anywhere.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] embedded postgresql

2004-10-19 Thread Josh Berkus
GB:

 I would like to know if there are any discussions about
 creating an embedded version on postgresql. My thoughts
 go towards building/porting a sqlite equivalent of pg.

Not that I personally know of.While it would be nice to have an embeddable 
database which was syntax-compatible with PostgreSQL, it would be so 
radically different codewise as to be a completely different project.   
Probably the only thing you'd re-use would be the query parser code.  At a 
guess, you'd probably be better off trying to weld out query parser to 
BerkeleyDB or something than to try to downsize the PG code.

If you do, though, please let us know, and feel free to host it at pgFoundry.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Time off

2004-10-19 Thread Joshua D. Drake
There comes the time in every hackers life when he discovers that even 
unsuccessfully chasing girls can be more fun than debugging kernel 
modules or interface libraries. Some get over that phase without greater 
collateral damage, some become successfull in the chasing, some then get 
caught by the upgrade policies of this quite different kind of hard- and 
software, and some even go that far that they experiment with its 
replication features ... and believe me, it takes a lot of time to get 
those replicas running :-)
Your telling me and we are not even legally allowed to use them as slaves ;)
Sincerely,
Joshua D. Drake


Jan

Regards,
Andreas
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



--
Command Prompt, Inc., home of PostgreSQL Replication, and plPHP.
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
begin:vcard
fn:Joshua D. Drake
n:Drake;Joshua D.
org:Command Prompt, Inc.
adr:;;PO Box 215;Cascade Locks;Oregon;97014;USA
email;internet:[EMAIL PROTECTED]
title:Consultant
tel;work:503-667-4564
tel;fax:503-210-0334
note:Command Prompt, Inc. is the largest and oldest US based commercial PostgreSQL support provider. We  provide the only commercially viable integrated PostgreSQL replication solution, but also custom programming, and support. We authored  the book Practical PostgreSQL, the procedural language plPHP, and adding trigger capability to plPerl.
x-mozilla-html:FALSE
url:http://www.commandprompt.com/
version:2.1
end:vcard


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Using ALTER TABLESPACE in pg_dump

2004-10-19 Thread Philip Warner
At 03:25 AM 20/10/2004, Tom Lane wrote:
Maybe there's something I don't understand.  How are you expecting
pg_restore to control whether it outputs the command with a TABLESPACE
clause embedded or not, if pg_dump has already built the command string
that way?
This will only work if we modify the dump format (a new version) of 
dump/restore; the TOC entry for a table would have:

DEFINITION: CREATE TABLE fred ... %%tablespace%% ...
TABLESPACE: ' TABLESPACE t'
pg_restore would read these, and use the settings from the command line to 
either substitute an empty string or the TABLESPACE text for %%tablespace%% 
in the DEFINTION.

Same would apply for indexes etc.


Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 03 5330 3172  | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp.mit.edu:11371   |/ 

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Using ALTER TABLESPACE in pg_dump

2004-10-19 Thread Tom Lane
Philip Warner [EMAIL PROTECTED] writes:
 DEFINITION: CREATE TABLE fred ... %%tablespace%% ...
 TABLESPACE: ' TABLESPACE t'

 pg_restore would read these, and use the settings from the command line to 
 either substitute an empty string or the TABLESPACE text for %%tablespace%% 
 in the DEFINTION.

Nope.  I can break that trivially, eg:

CREATE INDEX fooi ON foo (f1) WHERE upper(f1)  ' %%tablespace%%';

Not very probable, maybe, but you can't just do a blind sed-style
substitution.

There's also the nontrivial matter of how pg_dump would decide where to
insert the %%tablespace%% string into the CREATE INDEX command in the
first place.  If we're going to add code to parse CREATE INDEX and
insert the tablespace in the correct place, meseems it'd be better to
insert it on the pg_restore side.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Time off

2004-10-19 Thread Marc G. Fournier
On Tue, 19 Oct 2004, Jan Wieck wrote:
On 10/19/2004 12:11 PM, Andreas Pflug wrote:
Marc G. Fournier wrote:
Enjoy the break :)  Hints as to the 'other stuff' that is more intersting 
then PostgreSQL? :)  Or is it secret ... ?
It's probably just a joke. Can you imagine something more interesting than 
PostgreSQL?!?
There comes the time in every hackers life when he discovers that even 
unsuccessfully chasing girls can be more fun than debugging kernel 
modules or interface libraries. Some get over that phase without greater 
collateral damage, some become successfull in the chasing, some then get 
caught by the upgrade policies of this quite different kind of hard- and 
software, and some even go that far that they experiment with its 
replication features ... and believe me, it takes a lot of time to get 
those replicas running :-)
*rofl*  I can definitely relate to this one ... and, assuming that this is 
an accurate assessment of CKL's current situation ... most heartfelt 
congratulations to you and yours :)


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Time off

2004-10-19 Thread Andrew Rawnsley
On Oct 19, 2004, at 2:05 PM, Joshua D. Drake wrote:
There comes the time in every hackers life when he discovers that 
even unsuccessfully chasing girls can be more fun than debugging 
kernel modules or interface libraries. Some get over that phase 
without greater collateral damage, some become successfull in the 
chasing, some then get caught by the upgrade policies of this quite 
different kind of hard- and software, and some even go that far that 
they experiment with its replication features ... and believe me, it 
takes a lot of time to get those replicas running :-)
Your telling me and we are not even legally allowed to use them as 
slaves ;)

Its also an unusual replication scheme in that, more often than not, 
the slaves control the masters.


Sincerely,
Joshua D. Drake

Jan
Regards,
Andreas
---(end of 
broadcast)---
TIP 4: Don't 'kill -9' the postmaster

--
Command Prompt, Inc., home of PostgreSQL Replication, and plPHP.
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
jd.vcf
---(end of 
broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


[HACKERS] Possible make_oidjoins_check Security Issue

2004-10-19 Thread Rod Taylor
http://secunia.com/advisories/12860/



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] CSS

2004-10-19 Thread Nurlan M. Mukhanov
Hello!

I wrote this css file 2 years ago. It's very useful when using docs.
Can you add it in mainstream?

Thanks.

Nurlan MukhanovBODY { font-family : Tahoma; font-size : 8pt; color : #33; background: #FF; }
A:LINK { color : #66; }
A:HOVER { color : #CC;text-decoration : underline;}
A:VISITED {color : #3366CC;}
HR {size: 1;color: #3366CC;width: 90%;text-align : center;}
TABLE {font-family: Tahoma;font-size: 8pt;}
h1 {font-family: Tahoma;color: #336600;font-size: 18pt;}
h3 {font-family: Tahoma;color: #339966;font-size: 14pt;border : thin none;}
h2 {font-family: Tahoma;color: #00;font-size: 16pt;font-weight: bold;}
.ABBREV {color:#993399; font-weight: bold;}
.ABSTRACT {color: #99; letter-spacing: 3; margin-left: 18; font-family: Times New Roman;font-size: 12pt; font-style : italic;}
.ACRONYM {color:#0066CC; font-weight: bold;}
.APPENDIX {}
.APPLICATION {color: #00; font-family: Courier New; font-weight: bold; font-size: 9pt}
.ATTRIBUTION {font-family: Arial; color: #003300; font-size: 8pt; font-weight: bold;}
.AUTHOR {color: #336600; font-weight: bold;}
.BIBLIODIV {text-align: center;}
.BIBLIOENTRY {}
.BIBLIOENTRYBLOCK {}
.BIBLIOGRAPHY {}
.BLOCKQUOTE {}
.BOOK {}
h3.BRIDGEHEAD {color: #5757AC}
.CALLOUTLIST {}
table.CALSTABLE {font-family: Tahoma; font-size: 8pt; color: #33; border: #4A4A4A 1px; border-collapse: collapse; border-style: solid; text-indent : 3px;}
table.CAUTION {background: #F; align: center; font-family: Tahoma; font-size: 8pt; color: #FF; border: #FF 1px solid; border-color: #FF 1px solid; border-collapse: collapse; border-style: solid; text-indent : 3px;}
.CHAPTER {}
.CITEREFENTRY {}
.CITETITLE {color: #DD; font-family: Times New Roman;font-size: 12pt; font-style: italic; font-weight: bold;}
.CLASSNAME {color: #FF; font-family: Courier; font-size: 10pt; font-weight: bold;}
.COMMAND {color: #0033CC;}
.COMPUTEROUTPUT {font-family: Courier New;font-size: 8pt;color: #66; font-weight: bold;}
.CONFDATES {color: #747474}
.CONFTITLE {color: #669966}
h1.COPYRIGHT {text-align: center;}
h3.CORPAUTHOR {text-align: center;}
.DATABASE {color:  #003366; font-weight: bold;}
.EMAIL {color: #996699}
.EMPHASIS {font-weight: bold; font-family: Times New Roman;font-size: 10pt; font-style: italic; color: #336699}
.ENVAR {font-family: Arial;font-weight: bold;font-size: 8pt;color:#990099;}
.ERRORNAME {color:#CC3300;font-weight: bold;}
.EXAMPLE {}
.FIGURE {}
.FILENAME {font-family: Courier;font-size: 10pt;color:#8C4600;font-weight: bold;}
.FIRSTTERM {font-family: Times New Roman;font-size: 10pt; font-style: italic; color: #CC6600;}
.FOOTNOTES {color: #CC;}
.FOREIGNPHRASE {font-family: Times New Roman;font-size: 10pt; font-style: italic; color: #009148; font-weight: bold;}
.FORMALPARA {}
.FUNCTION {color:  #CC3399;font-family: Times New Roman;font-size: 9pt; font-weight: bold;}
.GUIBUTTON {color: #33; font-weight: bold;}
.GUILABEL {color: #5E5E00; font-weight: bold;}
.GUIMENU {color: #CC9900; font-weight: bold;}
.GUIMENUITEM {color: #00AA80; font-weight: bold;}
.IMPORTANT {background-color: #FFF2F2}
h2.INDEXDIV {color: #6600FF; font-family: Times New Roman;font-size: 17pt; font-weight: bold; font-style: italic;}
.INFORMALEXAMPLE {}
.INFORMALFIGURE {}
.INFORMALTABLE {}
.KEYCAP {color: #FF0066;}
.LEGALNOTICE {border-color: gray 2px solid;}
.LINEANNOTATION {color: #66; font-family: Times New Roman;font-size: 10pt; font-weight: bold;}
.LITERAL {font-family: Courier;font-size: 10pt;color: #6600CC;}
.LITERALLAYOUT {color: #575739;}
.LOT {}
.MEDIAOBJECT {}
.MSG {}
.MSGENTRY {}
.MSGEXPLAN {color: #FF;}
.NAVFOOTER {}
.NAVHEADER {}
.NOTE {}
.OPTION {color: #0066FF; font-weight: bold;}
.OPTIONAL {font-family: Times New Roman;font-size: 9pt;font-style: italic; color: #0099CC;}
.PARA {}
.PARAMETER {color: #44;}
.PART {text-align: left;}
.PARTINTRO {}
.PREFACE {}
.PROCEDURE {}
.PRODUCTNAME {font-weight: bold;}
.PROGRAMLISTING {font-family: Fixedsys; color: #AF5F5F}
.PROMPT {color: #99;font-family: Courier New;font-weight: bold;}
.QUOTE {color: #99; font-weight: bold;}
.REFENTRY {}
.REFENTRYTITLE {font-weight: bold;}
.REFERENCE {}
.REFNAMEDIV {}
.REFSECT1 {}
.REFSECT2 {}
.REFSECT3 {}
.REFSYNOPSISDIV {}
.REPLACEABLE {font-family: Times New Roman;font-size: 12pt; color: #66}
.RETURNVALUE {color: #00;}
.SCREEN {font-family: Fixedsys;color: #00;}
.SECT1 {}
.SECT2 {}
.SECT3 {}
.SECT4 {}
.SET {}
.SETINDEX {}
.SGMLTAG {color: #009933; Times New Roman;font-size: 10pt; font-weight: bold;}
.STRUCTFIELD {color: #CC0099; Times New Roman;font-size: 9pt; font-style: italic;}
.STRUCTNAME {color: #0066FF; font-weight: bold;font-size: 10pt;}
.SUBSTEPS {color: #669900;}
.SYMBOL {color: #339933; font-weight: bold; font-size: 10pt;}
.SYNOPSIS {font-family: Courier;font-size: 10pt;color: #74744E;}
.SYSTEMITEM {font-weight: bold; color: #FF5959;}
.TABLE {}
.TIP {}
.TITLE {text-align: center;}
.TITLEPAGE {text-align: center;}
.TOC {text-align: left;}

Re: [HACKERS] Hypothetical Indexes

2004-10-19 Thread Marcos A Vaz Salles
Hello Josh,

  I will take a look at pgFoundry and register a new project for index
selection. I will also look for other projects there that we may help
somehow. About the tutorial, I will send you the presentation we used
to generate the web pages so that you can see it with OpenOffice.  ;)

  Thanks for your feedback,

Marcos.


On Wed, 13 Oct 2004 11:13:55 -0700, Josh Berkus [EMAIL PROTECTED] wrote:
 Marcos,
 
http://www.inf.puc-rio.br/~postgresql/
 
There you will find a link to a tutorial based description of the
  hypothetical indexes feature we have implemented on PostgreSQL 7.4
  beta 3.
 
 I would love to see this as an add-in project on pgFoundry.   Particularly
 since your online tutorial only works in Internet Explorer, so I can't read
 it.
 
 I believe that there was/is a team exploring a set of utilities to produce
 database optimization hints for the admin.   Your idea would dovetail
 nicely with that.
 
 --
 --Josh
 
 Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] DETERMINISTIC as synonym for IMMUTABLE

2004-10-19 Thread Simon Riggs
Tom Lane wrote
 In any case, whether or not you think DETERMINISTIC means IMMUTABLE,

Tom, Your knowledge of the confusing bits of the standard puts us all to
shame.

Troels did have a point, which was to do with standards conformance and
compatibility. The main point at issue is whether someone can run some ANSI
compliant SQL against PostgreSQL and have it work. That's a worthy goal.

AFAICS, your info shows that the standard's definition of DETERMINISTIC is
confusing and contradictory. Most people's interpretation would be that
DETERMINISTIC was the same as IMMUTABLE, so we should make the former a
synonym for the latter and document the possible difference of
interpretation. Seriously, if you can't put a blade of grass between them
then they're OK to be equated.

My understanding is that DETERMINISTIC in Oracle would work the same as
IMMUTABLE in PostgreSQL...

 I don't think it's very helpful to identify NOT DETERMINISTIC with
 VOLATILE.  As a counterexample, now() is NOT DETERMINISTIC, but it
 isn't VOLATILE.


You're spot on again with your info. NOT DETERMINISTIC means either STABLE
or VOLATILE in PostgreSQL terms, not just one of those.

IMHO we should allow the use of NOT DETERMINISTIC and document that although
it doesn't mean the same thing as VOLATILE, we should infer that meaning
because that is the mapping that is always correct. If the user wishes to
gain the possible performance advantages offered by STABLE, then they can
alter their code to do so. We're allowed to have performance enhancing
additions to the standard.

This is a similar situation to PostgreSQL's implementation of transaction
isolation levels. The implementation is both implemented according to the
standard and transactionally correct, yet READ UNCOMMITTED doesn't work
*exactly* as the standard says that level should, yet this is all clearly
documented and we are happy with that.

The standard ain't perfect, but we should get as close as possible and
document the difference - as long as there's no loss of correctness, which I
don't think is at issue here.

I'll submit a patch unless there is substantial disagreement.

Best Regards,

Simon Riggs


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Simon Riggs
Mark Kirkwood wrote
  Tom Lane wrote:
 
 I believe that the term bitmap index is also used with a different
 meaning wherein it actually does describe a particular kind of on-disk
 index structure, with one bit per table row.
 
 IMHO building in-memory bitmaps (the first idea) is a very good idea to
 pursue for Postgres.  I'm not at all sold on on-disk bitmap indexes,
 though ... those I suspect *are* sufficiently replaced by partial
 indexes.
 

Well, if we could cache the bitmap after it was created the first time then
that might offer almost the same thing. :-)

I was thinking about this recently, then realised that building the bitmap
would not be as easily, since PostgreSQL doesn't index null values. That
would mean that the sets of CTIDs in each index would be disjoint. My
thinking about dynamic bitmaps came from Teradata, which does index null
values.

How would you dynamically build the bit maps from the indexes?

Or would you:
- copy aside and sort the indexes on CTID
- merge join them all to find matching CTIDs
- probe into the main table

Hopefully, I've missed something that you've thought of !

 I believe that the benefit of on-disk bitmap indexes is supposed to be
 reduced storage size (compared to btree).

 In the cases where I have put them to use, they certainly occupy
 considerably less disk than a comparable btree index - provided there
 are not too many district values in the indexed column.


The main problem is the need for the table to be read-only. Until we have
partitioning, we wouldn't be able to easily guarantee parts of a table as
being (effectively) read-only.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] tsearch2 windows make failure

2004-10-19 Thread Andrew Dunstan

Tom Lane wrote:
 

Info: resolving _my_exec_path by linking to __imp__my_exec_path (auto-import)
fu01.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname'
nmth00.o(.idata$4+0x0): undefined reference to `_nm__my_exec_path'
   

I was wondering whether my_exec_path might need to be marked DLLIMPORT.
Not sure about the other symbol though.
 

Is this going to be fixed? Right now tsearch2 is totally busted for 
Windows. I would fix it but my attempt (marking my_exec_path with 
DLLIMPORT) just seemed to make things worse (pgtz.c blew up).

cheers
andrew
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Alvaro Herrera
On Tue, Oct 19, 2004 at 11:22:31PM +0100, Simon Riggs wrote:

 I was thinking about this recently, then realised that building the bitmap
 would not be as easily, since PostgreSQL doesn't index null values. That
 would mean that the sets of CTIDs in each index would be disjoint. My
 thinking about dynamic bitmaps came from Teradata, which does index null
 values.

Huh, you are wrong.  At least btree does index null values, and one
other index method does too.  The other two index methods don't.  What
doesn't work is using an index with the IS NULL construct, because it's
not an operator.  Maybe that can be fixed by some other means ... some
parser magic perhaps.

 Or would you:
 - copy aside and sort the indexes on CTID
 - merge join them all to find matching CTIDs
 - probe into the main table

IIRC part of the trick was to build bitmaps to apply bitwise-AND/OR
operators.  This allows to use multiple indexes for one scan, for
example.


I don't understand your comment about read only tables ...

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
I call it GNU/Linux. Except the GNU/ is silent. (Ben Reiter)


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] tsearch2 windows make failure

2004-10-19 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I was wondering whether my_exec_path might need to be marked DLLIMPORT.
 Not sure about the other symbol though.

 Is this going to be fixed? Right now tsearch2 is totally busted for 
 Windows. I would fix it but my attempt (marking my_exec_path with 
 DLLIMPORT) just seemed to make things worse (pgtz.c blew up).

[ shrug ... ]  *I'm* not going to fix it; I don't have a Windows setup
I could test a proposed fix on.  And evidently it does need testing.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Mark Kirkwood
Simon Riggs wrote:

I believe that the benefit of on-disk bitmap indexes is supposed to be
reduced storage size (compared to btree).
   

The main problem is the need for the table to be read-only. Until we have
partitioning, we wouldn't be able to easily guarantee parts of a table as
being (effectively) read-only.
 

I don't believe that read only is required. The update/insert 
performance impact of bimap indexes is however very high (in Oracle's 
implementation anyway) - to the point where many sites drop them before 
adding in new data, and recreated 'em afterwards!

In the advent that there is a benefit for the small on-disk footprint, 
the insert/update throughput implications will need to be taken into 
account.

cheers
Mark
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I was thinking about this recently, then realised that building the bitmap
 would not be as easily, since PostgreSQL doesn't index null values.

As Alvaro already pointed out, this statement is bogus; and I'm not sure
what it has to do with the topic anyway.  All you care about is the rows
that the index fingers as matching your scan condition.  If the scan
condition is strict (which it usually is) it does not matter whether the
index stores entries for nulls or not.

 How would you dynamically build the bit maps from the indexes?

 Or would you:
 - copy aside and sort the indexes on CTID
 - merge join them all to find matching CTIDs
 - probe into the main table

I've been taking bitmap to be a rather handwavy way of saying a
compact representation of sets of CTIDs that is readily amenable to
being ANDed and ORed with other sets.  I don't think it'll be a pure
bitmap with no other superstructure; at the very least you'd want to
apply some sort of sparse-bitmap and/or compression techniques.  I do
suspect a bitmappy kind of representation will be more effective than
sorting arrays of CTIDs per se, although in principle you could do it
that way too.

But yeah, the basic idea is to scan an index and build some sort of
in-memory set of CTIDs of selected rows; possibly AND or OR this with
other sets built from other indexes; and then scan the set and probe
into the heap at the indicated places.  One huge advantage is that the
actual heap visiting becomes efficient, eg you never visit the same page
more than once.  (What you lose is the ability to retrieve data in
index order, so this isn't a replacement for existing indexscan methods,
just another plan type to consider.)

One interesting thought is that the bitmappy representation could be
lossy.  For instance, once you get to the point of needing to examine
most of the rows on a particular page, it's probably not worth
remembering exactly which rows; you could just remember that that whole
page is a target, and sequentially scan all the rows on it when you do
visit the heap.  (ANDing and ORing still works.)  This can scale up to
visiting consecutive ranges of pages; in the limit the operation
degenerates to a seqscan.  With this idea you can guarantee that the
in-memory bitmaps never get impracticably large.  (Obviously if they get
so large as to push the system into swapping, or even run the backend
out of memory completely, you lose, so this is a real nice guarantee to
be able to make.)  The whole thing starts to look like a self-adaptive
interpolation between our present indexscan and seqscan techniques,
which takes a lot of pressure off the planner to correctly guess the
number of matching rows in advance.

I remember batting these ideas around with people at the 2001 OSDB
summit conference ... I didn't think it would take us this long to get
around to doing it ...

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Josh Berkus
Tom,

 I've been taking bitmap to be a rather handwavy way of saying a
 compact representation of sets of CTIDs that is readily amenable to
 being ANDed and ORed with other sets.  

Well, actually I think we're talking about two different features:

1) a way to use more than one index per operation;
2) a more compact and thus faster index representation

The fact that Oracle solved both problems with the same code doesn't, 
obviously mean that we have to.   There's been a lot of discussion around 
problem (2) on this thread, but I don't want to lose sight of problem 
(1)  especially since that's the problem faced by several active 
community members right now.

You gave the impression that (1) could be implemented with regular BTree 
indexes in an earlier e-mail.   Would that be very hard to do?

 The whole thing starts to look like a self-adaptive 
 interpolation between our present indexscan and seqscan techniques,
 which takes a lot of pressure off the planner to correctly guess the
 number of matching rows in advance.

This would be way cool.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] CSS

2004-10-19 Thread Peter Eisentraut
Nurlan M. Mukhanov wrote:
 I wrote this css file 2 years ago. It's very useful when using docs.
 Can you add it in mainstream?

http://jigsaw.w3.org/css-validator/ should be your first stop.  After 
that, you can write to [EMAIL PROTECTED] with an explanation 
about what your stylesheet does and why you think it's a good idea.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Gavin Sherry
On Tue, 19 Oct 2004, Josh Berkus wrote:

 Tom,

  I've been taking bitmap to be a rather handwavy way of saying a
  compact representation of sets of CTIDs that is readily amenable to
  being ANDed and ORed with other sets.

 Well, actually I think we're talking about two different features:

 1) a way to use more than one index per operation;
 2) a more compact and thus faster index representation

For those interested, how this generally works is that for every distinct
value in the column being indexed, a bitmap of unique row identifiers (ie,
tids) is created. With compression, this can greatly reduce the size of
indexes on a large number of rows with a small number of distinct values
(a situation in which we're highly likely to use seq scan index of index
in Postgres).

For qualifications like: bitmapcol1 AND/OR bitmapcol2, we can use bitmap
and/or respectively. Of course, this is all in theory.

Bitmap indexes can suffer concurrency issues, depending on the granularity
of locking.

 You gave the impression that (1) could be implemented with regular BTree
 indexes in an earlier e-mail.   Would that be very hard to do?

  The whole thing starts to look like a self-adaptive
  interpolation between our present indexscan and seqscan techniques,
  which takes a lot of pressure off the planner to correctly guess the
  number of matching rows in advance.

 This would be way cool.

I think there's a lot of be gained by the technique above as an
alternative to our current access methods. Its just a feeling however, I
haven't prototyped this.

Thanks,

Gavin

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Simon Riggs
 Tom Lane
 Simon Riggs [EMAIL PROTECTED] writes:

  How would you dynamically build the bit maps from the indexes?

  Or would you:
  - copy aside and sort the indexes on CTID
  - merge join them all to find matching CTIDs
  - probe into the main table

 I've been taking bitmap to be a rather handwavy way of saying a
 compact representation of sets of CTIDs that is readily amenable to
 being ANDed and ORed with other sets.  I don't think it'll be a pure
 bitmap with no other superstructure; at the very least you'd want to
 apply some sort of sparse-bitmap and/or compression techniques.  I do
 suspect a bitmappy kind of representation will be more effective than
 sorting arrays of CTIDs per se, although in principle you could do it
 that way too.


OK. You seemed to be implying that.

 (What you lose is the ability to retrieve data in
 index order, so this isn't a replacement for existing indexscan methods,
 just another plan type to consider.)

Never seen an application that required a bitmap plan and sorted output.
Have you? Mostly count(*), often sum() or avg(), but never sorted, surely.

Considering there would always be 1 index, which index order did we want
anyhow?

 One interesting thought is that the bitmappy representation could be
 lossy.  For instance, once you get to the point of needing to examine
 most of the rows on a particular page, it's probably not worth
 remembering exactly which rows; you could just remember that that whole
 page is a target, and sequentially scan all the rows on it when you do
 visit the heap.  (ANDing and ORing still works.)  This can scale up to
 visiting consecutive ranges of pages; in the limit the operation
 degenerates to a seqscan.  With this idea you can guarantee that the
 in-memory bitmaps never get impracticably large.  (Obviously if they get
 so large as to push the system into swapping, or even run the backend
 out of memory completely, you lose, so this is a real nice guarantee to
 be able to make.)  The whole thing starts to look like a self-adaptive
 interpolation between our present indexscan and seqscan techniques,
 which takes a lot of pressure off the planner to correctly guess the
 number of matching rows in advance.

Well, thats the best one yet. That's the solution, if ever I heard it.

The reduction in bitmap size makes their use much safer. Size matters, since
we're likely to start using these techniques on very large databases, which
imply obviously have very large CTID lists. The problem with guessing the
number of rows is you're never too sure whether its worth the startup
overhead of using the bitmap technique. my next question was going to
be, so how will you know when to use the technique?

Hmmmthinkyou'd need to be clear that the cost of scanning a block
didn't make the whole thing impractical. Generally, since we're using this
technique to access infrequent row combinations, we'd be looking at no more
than one row per block usually anyway. So the technique is still I/O bound -
a bit extra post I/O cpu work won't hurt much. OK, cool.

 I remember batting these ideas around with people at the 2001 OSDB
 summit conference ... I didn't think it would take us this long to get
 around to doing it ...

...as if you haven't been busy... ;-)

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Simon Riggs
Alvaro Herrera
 On Tue, Oct 19, 2004 at 11:22:31PM +0100, Simon Riggs wrote:

  I was thinking about this recently, then realised that building the
bitmap
  would not be as easily, since PostgreSQL doesn't index null values. That
  would mean that the sets of CTIDs in each index would be disjoint. My
  thinking about dynamic bitmaps came from Teradata, which does index null
  values.

 Huh, you are wrong.

Always happy to learn. Thanks for letting me know.

 At least btree does index null values, and one
 other index method does too.  The other two index methods don't.  What
 doesn't work is using an index with the IS NULL construct, because it's
 not an operator.  Maybe that can be fixed by some other means ... some
 parser magic perhaps.

The manual says this (CREATE INDEX)
Indexes are not used for IS NULL clauses by default. The best way to use
indexes in such cases is to create a partial index using an IS NULL
comparison. 

Perhaps we can find a better way of wording this to explain what actually
occurs, which after your comments, I'm less clear on than I was before.
Could you clarify further, so we can update the documentation to be very
specific, or at least clearer.

  Or would you:
  - copy aside and sort the indexes on CTID
  - merge join them all to find matching CTIDs
  - probe into the main table

 IIRC part of the trick was to build bitmaps to apply bitwise-AND/OR
 operators.  This allows to use multiple indexes for one scan, for
 example.

Yes, an implication of my question was and would that then give greater
overhead for 2 indexes...

 I don't understand your comment about read only tables ...

These are restrictions on the Oracle implementation.

If you had a larger data warehouse table that grew over time, then typically
the older data wouldn't change much and so a read-only technique could be
sensibly applied.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Using ALTER TABLESPACE in pg_dump

2004-10-19 Thread Philip Warner
At 04:20 AM 20/10/2004, Tom Lane wrote:
Nope.  I can break that trivially, eg:
Thats why in my first message I mentioned escaping and unescaping all '%' 
in the deinition.


There's also the nontrivial matter of how pg_dump would decide where to
insert the %%tablespace%% string into the CREATE INDEX command in the
first place.
I'd vote against parsing, and add a parameter to get_indexdef.

If we're going to add code to parse CREATE INDEX and
insert the tablespace in the correct place, meseems it'd be better to
insert it on the pg_restore side.
But if we have to parse, I'd add it in pg_dump so all items that are 
relevant can be dumped with '%%tablespace%%'. pg_dump still constructs 
CREATE TABLE statements, so that is the natural place to add the tablespace 
marker and avoid parsing for tables.



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 03 5330 3172  | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp.mit.edu:11371   |/ 

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Sailesh Krishnamurthy
 Tom == Tom Lane [EMAIL PROTECTED] writes:

Tom One huge advantage is that the actual heap visiting becomes
Tom efficient, eg you never visit the same page more than once.
Tom (What you lose is the ability to retrieve data in index
Tom order, so this isn't a replacement for existing indexscan
Tom methods, just another plan type to consider.)

Even without bitmap indexes, without trying to use multiple indexes
etc. this (visiting a page only once) is useful. 

In other words, I'd like to see the indexscan broken up into: (1) an
operator that returns a list of TIDs, (2) Sort the TIDs and (3) an
operator that fetches heap tuples from the sorted TID list. 

Of course the resulting data from the heap will be out of order but
that often times is less important than unnecessarily visiting (and
possibly even re-fetching from disk) the same heap page twice for a
given index scan. 

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Time off

2004-10-19 Thread Christopher Kings-Lynne
Enjoy the break :)  Hints as to the 'other stuff' that is more 
intersting then PostgreSQL? :)  Or is it secret ... ?
It's probably just a joke. Can you imagine something more interesting 
than PostgreSQL?!?
www.planeshift.it
(Sorry for the sucky flash intro :/)
I've been wanting to get into some 3d for a while...
Has a MySQL backend unfortunately - maybe I can convert them :)
Chris
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] plans for bitmap indexes?

2004-10-19 Thread Gavin Sherry
On Tue, 19 Oct 2004, Sailesh Krishnamurthy wrote:

  Tom == Tom Lane [EMAIL PROTECTED] writes:

 Tom One huge advantage is that the actual heap visiting becomes
 Tom efficient, eg you never visit the same page more than once.
 Tom (What you lose is the ability to retrieve data in index
 Tom order, so this isn't a replacement for existing indexscan
 Tom methods, just another plan type to consider.)

 Even without bitmap indexes, without trying to use multiple indexes
 etc. this (visiting a page only once) is useful.

 In other words, I'd like to see the indexscan broken up into: (1) an
 operator that returns a list of TIDs, (2) Sort the TIDs and (3) an
 operator that fetches heap tuples from the sorted TID list.

I'm uncertain about the potential benefit of this. Isn't/shouldn't the
effects of caching be assisting us here?

Gavin

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Possible make_oidjoins_check Security Issue

2004-10-19 Thread Neil Conway
On Wed, 2004-10-20 at 06:18, Rod Taylor wrote:
 http://secunia.com/advisories/12860/

This seems like a rather inconsequential problem, but it should be
fixed. The first two ideas that come to mind: use temporary files in
$PWD rather than /tmp, or create a subdirectory in /tmp to use for the
temporary files.

-Neil



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Time off

2004-10-19 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
 
  
 Has a MySQL backend unfortunately - maybe I can convert them :)
  
We might not let you back otherwise! :)
  
- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200410192349
  
-BEGIN PGP SIGNATURE-
 
iD8DBQFBdeDBvJuQZxSWSsgRAozdAJ49WWeoFclbLbh4102x5pebfV/JQwCfSbI9
gOKPqpXD7DfR+Ztrz2Bxwv8=
=5f8x
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Possible make_oidjoins_check Security Issue

2004-10-19 Thread Alvaro Herrera
On Wed, Oct 20, 2004 at 12:52:57PM +1000, Neil Conway wrote:
 On Wed, 2004-10-20 at 06:18, Rod Taylor wrote:
  http://secunia.com/advisories/12860/
 
 This seems like a rather inconsequential problem, but it should be
 fixed. The first two ideas that come to mind: use temporary files in
 $PWD rather than /tmp, or create a subdirectory in /tmp to use for the
 temporary files.

Better, use mktemp(1).  The thread testing script already does it IIRC.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Un poeta es un mundo encerrado en un hombre (Victor Hugo)


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Possible make_oidjoins_check Security Issue

2004-10-19 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Wed, 2004-10-20 at 06:18, Rod Taylor wrote:
 http://secunia.com/advisories/12860/

 This seems like a rather inconsequential problem,

Indeed, since ordinary users have no use for make_oidjoins_check.
It's surely very implausible that anyone would run it as root; and
even if someone did, the attacker cannot control what gets written.

 but it should be fixed. The first two ideas that come to mind: use
 temporary files in $PWD rather than /tmp, or create a subdirectory in
 /tmp to use for the temporary files.

I believe that the subdirectory idea is also vulnerable without great
care.

My inclination so far as the Red Hat packages are concerned is simply to
omit the contrib/findoidjoins files from the installed RPMs.

The patch originally proposed by trustix involved using mktemp(1), which
would be a great fix if mktemp(1) weren't so laughably unportable :-(
But in any case it's hard to see why we are expending RPM distro space
on this script in the first place.  I suspect that no one on the planet
except Bruce and myself have ever actually run this script.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Possible make_oidjoins_check Security Issue

2004-10-19 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Better, use mktemp(1).  The thread testing script already does it IIRC.

There are only two uses of mktemp(1) in our source tree: configure and
config.guess.  Both were gotten from elsewhere, and both jump through
some seriously unreadable hoops in order to achieve allegedly-portable
behavior.  mktemp(1) is simply not portable :-( ... the Single Unix Spec
refuses to touch it at all ...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Possible make_oidjoins_check Security Issue

2004-10-19 Thread Alvaro Herrera
On Wed, Oct 20, 2004 at 12:31:11AM -0400, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Better, use mktemp(1).  The thread testing script already does it IIRC.
 
 There are only two uses of mktemp(1) in our source tree: configure and
 config.guess.  Both were gotten from elsewhere, and both jump through
 some seriously unreadable hoops in order to achieve allegedly-portable
 behavior.

Huh, right.  I was remembering mkstemp(3), which is used in the thread
test (which is not a script after all ...)

config.guess usage surely is ugly ...

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Porque francamente, si para saber manejarse a uno mismo hubiera que
rendir examen... ¿Quién es el machito que tendría carnet?  (Mafalda)


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match