Re: [HACKERS] Documentation on page files

2002-04-22 Thread Hannu Krosing

On Tue, 2002-04-23 at 01:29, Martijn van Oosterhout wrote:
> 
> The dumping is more of an extra, the original idea was to check for errors
> in the datafiles. Hence the working name of "pgfsck". At the moment the
> dumping dumps only tuples where xmax == 0 but I'm not sure if that's
> correct.

AFAIK it is not. As Tom once explained to me, a tuple can have xmax
!= 0 and still be valid; the validity is determined by some status
bits in the tuple header.


But I think the most useful behaviour would be to dump the system
fields too, so that a mildly knowledgeable sysadmin can import the dump
and do the right thing afterwards (like restoring the data as it was
before transaction no. 7000)

-
Hannu


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] "make report"

2002-04-22 Thread cbbrowne

> I'd like to implement *something* to help us collect information on what
> platforms actually have what features. This would be useful, for
> example, for figuring out whether any platforms are lacking 8 byte
> integers or are missing timezone infrastructure.
> 
> I was thinking about something like "make report" which would mail the
> results of ./configure to, say, the ports mailing list. We could mention
> it in the text message printed at the end of the make cycle.
> 
> Comments? Suggestions?

Suggestion:  Why not embed this information into the binary, and provide some 
way of extracting it?

(There's a Linux kernel option that allows something similar, so it wouldn't 
be something unprecedented.)

If all the config information is embedded in the binary, automatically, at 
compile time, then it becomes possible to be _certain_ of things like:

- "Oh, that was compiled with a really stupid set of compiler options; you'll 
have to recompile!"

- "That was compiled without support for FOO, but with support for BAR."

- "Announcement, people:  Look out for whether or not your distribution 
compiled PostgreSQL with proper support for 64 bit integers.  Several 
distributions got this wrong with the 7.4.17 release, and you can see if it's 
OK by looking for LONG_LONG_REVISED in the embedded configuration information."

[Downside:  "Announcement, script kiddies:  If you find option 
UPDATE_DESCR_TABS=1 in the configuration information, then there's a very easy 
root exploit..."]
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/x.html
Rules of  the Evil  Overlord #176.  "I will add  indelible dye  to the
moat. It won't  stop anyone from swimming across,  but even dim-witted
guards should be  able to figure out when someone  has entered in this
fashion." 

-- 
(concatenate 'string "cbbrowne" "@acm.org")
http://www3.sympatico.ca/cbbrowne/spiritual.html
Including a destination in the CC list that will cause the recipients'
mailer to blow out is a good way to stifle dissent.
-- from the Symbolics Guidelines for Sending Mail







[HACKERS] "make report"

2002-04-22 Thread Thomas Lockhart

I'd like to implement *something* to help us collect information on what
platforms actually have what features. This would be useful, for
example, for figuring out whether any platforms are lacking 8 byte
integers or are missing timezone infrastructure.

I was thinking about something like "make report" which would mail the
results of ./configure to, say, the ports mailing list. We could mention
it in the text message printed at the end of the make cycle.

Comments? Suggestions?

- Thomas




[HACKERS] I am back

2002-04-22 Thread Bruce Momjian

I was in Boston for a few days for a wedding.  Never got time to be
online.  I am back now.  I will read my email and apply outstanding
patches tomorrow.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Simplifying OID lookups in the presence of namespaces

2002-04-22 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Also, for functions and operators the name alone is not sufficient to
>> uniquely identify the object.  Type regproc currently throws an error
>> if asked to convert a nonunique function name; that severely limits its
>> usefulness.  I'm toying with allowing datatypes in the input string,
>> eg
>> 'sum(bigint)'::regproc
>> but I wonder if this will create compatibility problems.  In particular,
>> should the regproc and regoperator output converters include datatype
>> indicators in the output string?  (Always, never, only if not unique?)

> I'd be inclined to include datatype always. If you don't, how can you 
> use this for pg_dump, etc?

pg_dump would probably actually prefer not having type info in the
output string; it'll just have to strip it off in most places.  But
I don't have a good feeling for the needs of other applications,
so I was asking what other people thought.

If we supported both ways via two datatypes, we'd have all the bases
covered; I'm just wondering if it's worth the trouble.

regards, tom lane

PS: interesting thought about enum ...




Re: [HACKERS] Simplifying OID lookups in the presence of namespaces

2002-04-22 Thread Joe Conway

Tom Lane wrote:
> A variant of the idea of inventing functions is to extend the existing
> datatype 'regproc' to do this, and invent also 'regclass', 'regtype',
> 'regoperator' datatypes to do the lookups for the other object kinds.
> I proposed this in a different context last year,
>   http://archives.postgresql.org/pgsql-hackers/2001-08/msg00589.php
> but it seemed too late to do anything with the idea for 7.2.
> 

Interesting thread. It seems like the same basic facility could also 
support an enum datatype, which people migrating from MySQL are always 
looking for.



> One question is what to do with invalid input.  For example, if table
> foo doesn't exist then what should 'foo'::regclass do?  The existing
> regproc datatype throws an error, but I wonder whether it wouldn't be
> more useful to return NULL.  Any thoughts on that?

NULL makes sense.

> 
> Also, for functions and operators the name alone is not sufficient to
> uniquely identify the object.  Type regproc currently throws an error
> if asked to convert a nonunique function name; that severely limits its
> usefulness.  I'm toying with allowing datatypes in the input string,
> eg
>   'sum(bigint)'::regproc
> but I wonder if this will create compatibility problems.  In particular,
> should the regproc and regoperator output converters include datatype
> indicators in the output string?  (Always, never, only if not unique?)

I'd be inclined to include datatype always. If you don't, how can you 
use this for pg_dump, etc?


Joe





Re: [HACKERS] What is wrong with hashed index usage?

2002-04-22 Thread Michael Loftis

The benchmarks will depend mostly on the depth of the btree.  Hashes 
will be markedly faster only in the cases where descending into the 
tree to reach a matching leaf node would take longer than walking to 
the appropriate item in a hash.

Most of the time, until the btree gets deep, they are nearly equivalent. 
When the tree ends up becoming many levels deep, it can take longer to 
walk than the hash.

Neil Conway wrote:

>On Mon, 22 Apr 2002 15:04:22 -0700
>"Dann Corbit" <[EMAIL PROTECTED]> wrote:
>
>>Here is where a hashed index shines:
>>To find a single item using a key, hashed indexes are enormously faster
>>than a btree.
>>
>>That is typically speaking.  I have not done performance benchmarks with
>>PostgreSQL.
>>
>
>Yes -- but in the benchmarks I've done, the performance difference
>is not more than 5% (for tables with ~ 600,000 rows, doing lookups
>based on a PK with "="). That said, my benchmarks could very well
>be flawed; I didn't spend a lot of time on them. If you'd like to
>generate some interest in improving hash indexes, I'd like to see
>some empirical data supporting your performance claims.
>
>Cheers,
>
>Neil
>






Re: [HACKERS] Documentation on page files

2002-04-22 Thread Martijn van Oosterhout

On Mon, Apr 22, 2002 at 11:14:36AM -0500, Ross J. Reedstrom wrote:
> On Sun, Apr 21, 2002 at 07:28:32PM +1000, Martijn van Oosterhout wrote:
> > 
> > http://svana.org/kleptog/pgsql/page.sgml.txt
> > 
> > I don't know whatever SGML format this is using, so the layout is not great,
> > but the information should be accurate. I used it to create a program to
> > dump the datafiles directly without the postmaster :).
> 
> Excellent - since this is a FRP (Frequently Requested Program) how do you
> feel about dumping it in contrib? Even if it's hardcoded for your particular
> table structure, it could serve as a starting point for some poor DBA
> who's got to recover from a lost xlog, for example.

Actually, it reads the table structure from the catalog. It will also find
the right files to open. It reads files from both PG 6.5 and 7.2, and it
shouldn't be too hard to make it work for other versions. And if you people
don't reorder the first few fields in pg_attribute, it will work for all
future versions too.

The dumping is more of an extra, the original idea was to check for errors
in the datafiles. Hence the working name of "pgfsck". At the moment the
dumping dumps only tuples where xmax == 0 but I'm not sure if that's
correct.

It doesn't handle compressed tuples or toasted ones, though that's more
advanced really. And of course outputting data in human-readable format has
to be added for each type. I only started writing it on Sunday, so let me
give it a usable interface and I'll let people try it out.

-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America




[HACKERS] NAMEDATALEN revisited

2002-04-22 Thread Rod Taylor

Using current CVS (yesterday) I've rerun the benchmarks to see the
effects of various NAMEDATALEN settings.

3 times per setting.


First time is pgbench inserts (-s 5)
Second time is pgbench run (-t 3000 -s 5)

Third time is the postmaster during both of the above.

I'll run it again tonight on a computer with faster disk I/O to see if
that helps decrease the sys times.

-
NAMEDATALEN: 32

   89.34 real 1.85 user 0.13 sys
  146.87 real 1.51 user 3.91 sys

  246.63 real  66.11 user  19.21 sys


NAMEDATALEN: 64

   93.10 real 1.82 user 0.16 sys
  147.30 real 1.45 user 3.90 sys

  249.28 real  66.01 user  18.82 sys


NAMEDATALEN: 128

   99.13 real 1.80 user 0.51 sys
  169.47 real 1.87 user 4.54 sys

  279.16 real  67.93 user  29.72 sys


NAMEDATALEN: 256

  106.60 real 1.81 user 0.43 sys
  166.61 real 1.69 user 4.25 sys

  283.76 real  66.88 user  26.59 sys






NAMEDATALEN: 512

   88.13 real 1.83 user 0.22 sys
  160.77 real 1.64 user 4.48 sys

  259.56 real  67.54 user  21.89 sys




#!/bin/sh

PGSRC=/home/rbt/postgresql/pgsqlnamedatalen
PGBASEPORT=6400
PGBASEBIN=/home/rbt/postgresql/dbname

LOG=/home/rbt/bench_namedatalen.log


for newDATALEN in 32 64 128 256 512 ; do

  PGBIN=${PGBASEBIN}_${newDATALEN}
  PGPORT=`echo "${PGBASEPORT}+${newDATALEN}" | bc`

  # double quotes so ${newDATALEN} expands; \s is not portable in sed
  sed -E "s/NAMEDATALEN[[:space:]]+[0-9]+/NAMEDATALEN ${newDATALEN}/g" ${PGSRC}/src/include/postgres_ext.h > ${PGSRC}/src/include/postgres_ext.h.tmp
  mv ${PGSRC}/src/include/postgres_ext.h.tmp ${PGSRC}/src/include/postgres_ext.h

  cd ${PGSRC}
  ./configure --prefix=${PGBIN} --with-pgport=${PGPORT} || (echo "UNABLE TO CONFIGURE"; exit)

  make clean
  make clean install

  cd ${PGSRC}/contrib/pgbench

  gmake install


  ${PGBIN}/bin/initdb -D ${PGBIN}/data  || (echo "UNABLE TO INITIALIZE"; exit 1)

  time -a -o ${LOG} ${PGBIN}/bin/postmaster -D ${PGBIN}/data -F &

  sleep 5

  echo "NAMEDATALEN: ${newDATALEN}" >> ${LOG}
  echo  >> ${LOG}
  time -a -o ${LOG} ${PGBIN}/bin/pgbench -i -s 5 -d template1 -p ${PGPORT}

  time -a -o ${LOG} ${PGBIN}/bin/pgbench -t 3000 -s 5 -d template1 -p ${PGPORT}
  echo >> ${LOG}
  ${PGBIN}/bin/pg_ctl -D ${PGBIN}/data stop
  echo >> ${LOG}
  echo >> ${LOG}
  rm -rf ${PGBIN}
done






[HACKERS] ecpg/preproc.y is generating reduce/reduce conflicts

2002-04-22 Thread Tom Lane

[tgl@rh1 preproc]$ make
bison -y -d  preproc.y
conflicts:  2 reduce/reduce

This is not good.

regards, tom lane




Re: [HACKERS] What is wrong with hashed index usage?

2002-04-22 Thread Neil Conway

On Mon, 22 Apr 2002 15:04:22 -0700
"Dann Corbit" <[EMAIL PROTECTED]> wrote:
> Here is where a hashed index shines:
> To find a single item using a key, hashed indexes are enormously faster
> than a btree.
> 
> That is typically speaking.  I have not done performance benchmarks with
> PostgreSQL.

Yes -- but in the benchmarks I've done, the performance difference
is not more than 5% (for tables with ~ 600,000 rows, doing lookups
based on a PK with "="). That said, my benchmarks could very well
be flawed; I didn't spend a lot of time on them. If you'd like to
generate some interest in improving hash indexes, I'd like to see
some empirical data supporting your performance claims.

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]>
PGP Key ID: DB3C29FC




Re: [HACKERS] What is wrong with hashed index usage?

2002-04-22 Thread Dann Corbit

> -Original Message-
> From: Neil Conway [mailto:[EMAIL PROTECTED]]
> Sent: Monday, April 22, 2002 2:59 PM
> To: Dann Corbit
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] What is wrong with hashed index usage?
> 
> 
> On Mon, 22 Apr 2002 14:15:37 -0700
> "Dann Corbit" <[EMAIL PROTECTED]> wrote:
> > From here:
> > http://osdb.sourceforge.net/
> > We find this quote:
> > "For you long-suffering OSDB PostgreSQL users, we offer 
> > 
> > --postgresql=no_hash_index 
> > 
> > to work around the hash index problems of OSDB with 
> > PostgreSQL V7.1 and
> > 7.2. As always, let us know of any problems. May the source be with
> > you!"
> > 
> > Does anyone know what the above is all about?
> 
> Yes -- search the list archives, or check the PostgreSQL 
> docs. This problem
> has been brought up several times: hash indexes deadlock 
> under concurrent
> load. A run of pgbench with a reasonably high concurrency 
> level (10 or 15)
> produces the problem consistently.
> 
> Previously, I had volunteered to fix this, but
> 
> (a) I'm busy with the PREPARE/EXECUTE stuff at the moment.
> 
> (b) I'm not sure it's worth the investment of time: AFAIK,
> hash indexes don't have many advantages over btrees for
> scalar data.
> 
> On the other hand, if someone steps forward with some data on a
> specific advantage that hash indexes have over btrees, I don't
> expect that the concurrency problems should be too difficult to
> solve.

Here is where a hashed index shines:
To find a single item using a key, hashed indexes are enormously faster
than a btree.

That is typically speaking.  I have not done performance benchmarks with
PostgreSQL.

In general, hashed indexes are much to be preferred when you are doing
frequent keyed lookups for single items.  Hashed indexes are (of course)
completely useless for an ordered scan or for wide ranges of continuous
data.




Re: [HACKERS] What is wrong with hashed index usage?

2002-04-22 Thread Neil Conway

On Mon, 22 Apr 2002 14:15:37 -0700
"Dann Corbit" <[EMAIL PROTECTED]> wrote:
> From here:
> http://osdb.sourceforge.net/
> We find this quote:
> "For you long-suffering OSDB PostgreSQL users, we offer 
> 
> --postgresql=no_hash_index 
> 
> to work around the hash index problems of OSDB with PostgreSQL V7.1 and
> 7.2. As always, let us know of any problems. May the source be with
> you!"
> 
> Does anyone know what the above is all about?

Yes -- search the list archives, or check the PostgreSQL docs. This problem
has been brought up several times: hash indexes deadlock under concurrent
load. A run of pgbench with a reasonably high concurrency level (10 or 15)
produces the problem consistently.

Previously, I had volunteered to fix this, but

(a) I'm busy with the PREPARE/EXECUTE stuff at the moment.

(b) I'm not sure it's worth the investment of time: AFAIK,
hash indexes don't have many advantages over btrees for
scalar data.

On the other hand, if someone steps forward with some data on a
specific advantage that hash indexes have over btrees, I don't
expect that the concurrency problems should be too difficult to
solve.

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]>
PGP Key ID: DB3C29FC




Re: [HACKERS] Implement a .NET Data

2002-04-22 Thread Francisco Jr.

 
 
Thanks Jean.

I will send a message to the ODBC list. I haven't created the
project on SourceForge yet; I will try to get a CVS account at
postgresql.org, as you said. :)

Thanks very much!!!

Francisco Jr.





[HACKERS] What is wrong with hashed index usage?

2002-04-22 Thread Dann Corbit

From here:
http://osdb.sourceforge.net/
We find this quote:
"For you long-suffering OSDB PostgreSQL users, we offer 

--postgresql=no_hash_index 

to work around the hash index problems of OSDB with PostgreSQL V7.1 and
7.2. As always, let us know of any problems. May the source be with
you!"

Does anyone know what the above is all about?




[HACKERS] Simplifying OID lookups in the presence of namespaces

2002-04-22 Thread Tom Lane

We are about to need to fix a fair number of places in client code
(eg, psql and pg_dump) that presently do things like

SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'foo');

This does not work reliably anymore because there could be multiple
relations named 'foo' in different namespaces.  The sub-select to
get the relation OID will fail because it'll return multiple results.

The brute-force answer is

SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'foo'
  AND relnamespace = (SELECT oid FROM pg_namespace
  WHERE nspname = 'bar'));

But aside from being really ugly, this requires that the client code
know exactly which namespace contains the relation it's after.  If
the client is relying on namespace search then it may not know that;
in fact, the client code very possibly isn't even aware of the exact
namespace search path it's using.  I am planning to introduce an
informational function CURRENT_SCHEMAS() (or some such name) that
returns the current effective search path, probably as a NAME[] array.
But it looks really, really messy to write an SQL query that makes
use of such a function to look up the first occurrence of 'foo' in
the search path.  We need to encapsulate the lookup procedure somehow
so that we don't have lots of clients reinventing this wheel.

We already have some functions that accept a text string and do a
suitable lookup of a relation; an example is nextval(), for which
you can presently write

nextval('foo')  --- searches namespace path for foo
nextval('foo.bar')  --- looks only in namespace foo
nextval('"Foo".bar')--- quoting works for mixed-case names

Seems like what we want to do is make the lookup part of this available
separately, as a function that takes such a string and returns an OID.
We'd need such functions for each of the namespace-ified object kinds:
relations, datatypes, functions, and operators.

A variant of the idea of inventing functions is to extend the existing
datatype 'regproc' to do this, and invent also 'regclass', 'regtype',
'regoperator' datatypes to do the lookups for the other object kinds.
I proposed this in a different context last year,
http://archives.postgresql.org/pgsql-hackers/2001-08/msg00589.php
but it seemed too late to do anything with the idea for 7.2.

If we went with the datatype approach then we'd be able to write
queries like

SELECT * FROM pg_attribute WHERE attrelid = 'foo'::regclass;

or

SELECT * FROM pg_attribute WHERE attrelid = 'foo.bar'::regclass;

or for that matter you could do

SELECT * FROM pg_attribute WHERE attrelid = regclass('foo');

which'd be syntactically indistinguishable from using a function.

The datatype approach seems a little bit odder at first glance, but it
has some interesting possibilities with respect to implicit casting
(see above-referenced thread).  So I'm inclined to go that route unless
someone's got an objection.

With a datatype, we also have outbound conversion to think of: so there
must be a function that takes an OID and produces a string.  What I am
inclined to do on that side is emit an unqualified name if the OID
refers to a relation/type/etc that would be found first in the current
namespace search path.  Otherwise, a qualified name (foo.bar) would be
emitted.  This will have usefulness for applications like pg_dump, which
will have exactly this requirement (per discussion a few days ago that
pg_dump should not qualify names unnecessarily).

One question is what to do with invalid input.  For example, if table
foo doesn't exist then what should 'foo'::regclass do?  The existing
regproc datatype throws an error, but I wonder whether it wouldn't be
more useful to return NULL.  Any thoughts on that?

Also, for functions and operators the name alone is not sufficient to
uniquely identify the object.  Type regproc currently throws an error
if asked to convert a nonunique function name; that severely limits its
usefulness.  I'm toying with allowing datatypes in the input string,
eg
'sum(bigint)'::regproc
but I wonder if this will create compatibility problems.  In particular,
should the regproc and regoperator output converters include datatype
indicators in the output string?  (Always, never, only if not unique?)
Doing so would be a non-backwards-compatible change for regproc.
We might avoid that complaint by leaving regproc as-is and instead
inventing a parallel datatype (say regfunction) that supports datatype
indications.  But I'm not sure whether regproc is used enough to make
this an important concern.

Comments?

regards, tom lane




Re: [HACKERS] Documentation on page files

2002-04-22 Thread Patrick Macdonald

Martijn,

It may be useful to look at the pg_filedump utility located
at http://sources.redhat.com/rhdb/tools.html 

This utility dumps out information at the page level and is
commented to help the user understand the format/content of
PostgreSQL heap/index/control files.

Cheers,
Patrick
-
Patrick Macdonald
Red Hat Database


Tom Lane wrote:
> 
> Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
> > Chapter 7 of the Developers guide is about the Page Format on disk and it's
> > a little out of date, not to mention somewhat incomplete.
> 
> Indeed, this seems to have very little relation to reality :-(.
> I didn't even realize that we had such a description in the SGML docs.
> It's obviously not been updated for many years.  I'm not sure if the
> "continuation" mechanism it describes ever existed at all, but it sure
> hasn't been there since the code left Berkeley.
> 
> > 1. Is there documentation elsewhere (other than the source)?
> 
> Not that I can think of.  The most accurate information seems to be in
> src/include/storage/bufpage.h; AFAICT all the comments in that file are
> up-to-date.  In addition to this it'd be worth pulling out some
> description of the "special space" structures used by the various index
> access methods.
> 
> > 2. If not, would patches be accepted to correct the situation?
> 
> Go for it.
> 
> regards, tom lane
> 




[HACKERS] [RFC] Set Returning Functions

2002-04-22 Thread Joe Conway

I've been reading past threads, studying backend code, reviewing Alex 
Pilosov's "cursor foo" patch (submitted last August/September, but never 
applied), and conversing off list with a few people regarding a possible 
implementation of Set Returning Functions (or SRF for short). Below is 
my proposal for how this might work. After discussion, and if there is 
no objection, I would like to work on this implementation with the hope 
that it could be in place for 7.3.


Proposal for set returning functions (SRF):
-

The problem:
-
Currently the ability to return multiple row, multiple column result 
sets from a function is quite limited. In fact, it is not possible to 
return multiple columns directly. It is possible to work around this 
limitation, but only in a clumsy way (see contrib/dblink for an 
example). Alternatively refcursors may be used, but they have their own 
set of issues, not the least of which is they cannot be used in view 
definitions or exist outside of explicit transactions.


The feature:
-
The desired feature is the ability to return multiple row, multiple 
column result sets from a function, or set returning functions (SRF) for 
short.


Do we want this feature?
-
Based on the many posts on this topic, I think the answer to this is a 
resounding yes.


How do we want the feature to behave?
-
A SRF should behave similarly to any other table_ref (RangeTblEntry), 
i.e. as a tuple source in a FROM clause. Currently there are three 
primary kinds of RangeTblEntry: RTE_RELATION (ordinary relation), 
RTE_SUBQUERY (subquery in FROM), and RTE_JOIN (join). SRF would join 
this list and behave in much the same manner.


How do we want the feature implemented? (my proposal)
-
1. Add a new table_ref node type:
- Current nodes are RangeVar, RangeSubselect, or JoinExpr
- Add new RangePortal node as a possible table_ref. The RangePortal
  node will be extended from the current Portal functionality.

2. Add support for three modes of operation to RangePortal:
   a. Repeated calls -- this is the existing API for SRF, but
  implemented as a tuple source instead of as an expression.
   b. Materialized results -- use a TupleStore to materialize the
  result set.
   c. Return query -- use current Portal functionality, fetch entire
  result set.

3. Add support to allow the RangePortal to materialize modes (a) and (c), if 
needed for a re-read.

4. Add a WITH keyword to CREATE FUNCTION, allowing the SRF mode to be 
specified. This would default to mode (a) for backward compatibility.

5. Ignore the current code which allows functions to return multiple 
results as expressions; we can leave it there, but deprecate it with the 
intention of eventual removal.

-
Thoughts/comments would be much appreciated.

Thanks,

Joe





Re: [HACKERS] Documentation on page files

2002-04-22 Thread Ross J. Reedstrom

On Sun, Apr 21, 2002 at 07:28:32PM +1000, Martijn van Oosterhout wrote:
> 
> http://svana.org/kleptog/pgsql/page.sgml.txt
> 
> I don't know whatever SGML format this is using, so the layout is not great,
> but the information should be accurate. I used it to create a program to
> dump the datafiles directly without the postmaster :).

Excellent - since this is a FRP (Frequently Requested Program) how do you
feel about dumping it in contrib? Even if it's hardcoded for your particular
table structure, it could serve as a starting point for some poor DBA
who's got to recover from a lost xlog, for example.

Ross




[HACKERS] Fixups on variable.c

2002-04-22 Thread Thomas Lockhart

I've committed a bit more to variable.c to handle integer inputs to GUC
parameters (string and float were already supported). I've included the
cvs log message below.

Further changes aren't precluded of course, but the code now supports
string, integer, and floating point inputs to parameters (for those
parameters which can accept them ;).

- Thomas

Convert GUC parameters back to strings if input as integers.
Change elog(ERROR) messages to say that a variable takes one parameter,
 rather than saying that it does not take multiple parameters.




Re: [HACKERS] [INTERFACES] sqlbang

2002-04-22 Thread .

Peter Eisentraut writes:
> [EMAIL PROTECTED] writes:
> > The most reason for patch are paremeters,
> 
> Parameters already exist:
> 
> peter ~$ cat test.sql
> \echo :x1
> \echo :x2
> peter ~$ pg-install/bin/psql -f test.sql -v x1=foo -v x2=bar
> foo
> bar
OK, positional parameters

-- 
@BABOLO  http://links.ru/




Re: [HACKERS] Schema (namespace) privilege details

2002-04-22 Thread Sander Steffann

Hi,

> > > > Maybe to keep hostile users from filling up your disk?
>
> Actually, I was serious, not sarcastic, about that "maybe." Like
> Tom, I'm not entirely sure that it's necessary to add this complexity,
> because there are so many other ways to abuse the system.

I know... But we have to start somewhere :)

> > I think Curt is right... If users are always allowed
> > to make temp tables, you can't give someone real read-only access to the
DB.
>
> Well, I'm not sure you can give "real" read-only access anyway.
> After all, if you've got a big enough table, all a user has to do
> is submit a few queries that sort the entire thing and you'll be
> eating up disk space like mad.

Ok. I forgot about that.

> But I think you can arrange for the
> sort files to go on another partition, to help limit the problems
> this would cause.

Sounds good.

> Another question is about the best place to put temporary tables.
> Right now they go in the database you're connected to, right? So
> it's possible for users that can create temporary tables to stop
> all inserts into that database by filling up its partition, but
> other DBs might be on different partitions and be unaffected.

At the moment all our DBs are on one partition. This would be a good reason
to split them, but it also makes it difficult if someone needs more space.

> Another way to go is to do what MS SQL server does, which is to
> put temp tables in a separate database. If you put that on its own
> partition, you can limit the damage users can do to the database
> that they're connected to, but then users can stop all other users
> from creating temporary tables.

That is true, but when I look at how many of our customers actually use temp
tables, I think this is not a very big problem (for us!).

> Personally, I feel the Postgres approach is better for postgres at
> this time, but there are other differences that help to make this
> so. In SQL Server, a "database" is really more a schema in the
> postgres sense, except that it's also a separate tablespace. So
> the two approaches are not directly comparable.
>
> In the end, it seems to me that there's only so much security you
> can implement in a database. I don't think that anybody produces
> a database server where I'd let random users connect directly,
> rather than going though an application that implements further
> security. Thus, one probably doesn't want to spend a lot of time
> trying to implement perfect security.

Only the idea of real read-only users seems useful to me. Maybe if temp
tables and big sorts could be limited this would be possible? Maybe a
restriction on CPU time... I don't know if there are any other places where
a user can eat resources, but the more I think about it, the more
complicated it gets. :-(

> Am I siding with you or Tom here? I'm not sure. :-)

I don't really care, as long as we reach a good solution! :-)

- Sander



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Implement a .NET Data Provider

2002-04-22 Thread Jean-Michel POURE

On Sunday 21 April 2002 at 02:11, Francisco Jr. wrote:
> I'd like to know if there is already anybody working
> with something like this because I'm creating a new
> project at sourceforge.net and I don't want to overlap
> anywork already done :).

Maybe you should try contacting the ODBC list, which mainly works on Windows 
features / connectivity. Also, the ODBC team might open a CVS account for you 
on Postgresql.org.

SourceForge does not allow projects to leave. Therefore, when your project is 
mature enough to be included in the PostgreSQL main tree, there will still be 
leftovers on SourceForge turning up in a Google search.

Cheers,
Jean-Michel POURE




Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE

2002-04-22 Thread Luis Alberto Amigo Navarro

Hi All.
I've been reading the whole thread and I want to add a few points:

You can set enable_seqscan=off for small or simple queries, but in large
queries an index can speed up some parts of the query and slow down others.
So if Postgres is to become a widely used DBMS, I think the planner must be
able to decide accurately on its own. There is a point in the thread that
might be useful: it would be very interesting if the planner could learn
from previous executions. There could even be a warm-up policy to let the
planner learn how the DB is being used; this information could be stored
with the DB data and could show, statistically, how index scans and seqscans
perform on every column of the DB.

I think it would be useful to listen to all users rather than being guided
only by our own experience; the main objective is to make a versatile DBMS.
It is easy to dismiss the need for better index usage by looking only at
single selects, but a lot of us are not doing single selects, so I think
that point needs to be heard.
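For concreteness, the per-session planner toggle mentioned above can be tried like this (the table and column names are made up for illustration; the actual plan chosen depends on your data and statistics):

```sql
-- Compare the planner's choice with and without sequential scans allowed
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

SET enable_seqscan = off;  -- strongly discourage seq scans for this session
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

SET enable_seqscan = on;   -- restore the default afterwards
```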
Regards

