Re: [HACKERS] Maximum table size

2003-09-09 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Is our maximum table size limited by the maximum block number?

Certainly.

 Is the 16TB number a hold-over from when we weren't sure block number
 was unsigned, though now we are pretty sure it is handled as unsigned
 consistently?

It's a holdover.  As to how certain we are that all the
signed-vs-unsigned bugs are fixed, who have you heard from running a
greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
or even VACUUM FULL on it?  AFAIK we have zero field experience to
justify promising that it works.

We can surely fix any such bugs that get reported, but we haven't got
any infrastructure that would find or prevent 'em.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] pgsql in shared lib

2003-09-09 Thread Czuczy Gergely
hello

There is an alternative SQL engine designed almost exactly for your needs; its
name is SQLite. It's a very simple one, but you can statically compile it
into your programs. Search SF or Freshmeat for it (or Google, of course).

Bye,

Gergely Czuczy
mailto: [EMAIL PROTECTED]
PGP: http://phoemix.harmless.hu/phoemix.pgp

The point is, that geeks are not necessarily the outcasts
society often believes they are. The fact is that society
isn't cool enough to be included in our activities.





Re: [HACKERS] Stats Collector Error 7.4beta1 and 7.4beta2

2003-09-09 Thread Kevin Brown
Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  This analysis makes sense - I think using memcmp is clearly wrong here.
 
 Yeah, now that I think about it, we're betting on the kernel to
 faithfully zero all unused bits in addrinfo structures.  In an ideal
 world, all kernels would do that, but in the real world it seems like
 a losing bet.

Yeah, I've always been under the impression that it's a bad idea in
general to memcmp() structs, if only because in doing so you make a
lot of implicit assumptions about the structs in question that aren't
necessarily true, especially when dealing with multiple architectures.

Makes me wonder if there are other parts of the code where we're
vulnerable to the same sort of issue...

 I could go for Jan's idea of putting a random key into the messages,
 if anyone feels that we should not trust to the kernel to enforce the
 packet source address restriction.  But the memcmp() test seems a clear
 loser given today's discussions.

The test in the 7.3.x code looked reasonable to me, especially if it's
possible to make it work with IPV6 (if it doesn't already).  It's doing
basically the right thing, at any rate: directly comparing the actual
fields that are relevant.  Does this test represent a significant
performance hit?




-- 
Kevin Brown   [EMAIL PROTECTED]



Re: [HACKERS] Maximum table size

2003-09-09 Thread scott.marlowe
On Tue, 9 Sep 2003, Tom Lane wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  Is our maximum table size limited by the maximum block number?
 
 Certainly.
 
  Is the 16TB number a hold-over from when we weren't sure block number
  was unsigned, though now we are pretty sure it is handled as unsigned
  consistently?
 
 It's a holdover.  As to how certain we are that all the
 signed-vs-unsigned bugs are fixed, who have you heard from running a
 greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
 or even VACUUM FULL on it?  AFAIK we have zero field experience to
 justify promising that it works.
 
 We can surely fix any such bugs that get reported, but we haven't got
 any infrastructure that would find or prevent 'em.

Any chance OSDL could test it?




Re: [HACKERS] [PATCHES] mcxt.c

2003-09-09 Thread Andrew Dunstan
The particular assertion that was proposed doesn't strike me as terribly 
useful - It should be checked at the point of call rather than inside 
pstrdup, I should have thought.

Of course, that would make for lots of code bloat ... cases like this 
are when gdb is your friend.

cheers

andrew

Tom Lane wrote:

Greg Stark [EMAIL PROTECTED] writes:
 Neil Conway [EMAIL PROTECTED] writes:
  I think the percentage of deployments that enable assertions (which
  causes a runtime performance hit) but NOT debugging info (which does
  not) is pretty small.

 How big a penalty is it? If it's small, or if it could be made small by making
 a few assertions require an extra extra-assertions option, then perhaps it
 would make more sense to ship with it enabled?

We generally don't recommend enabling assertions in production
installations, because it's not clear that there is any net gain in
stability from doing so.  Per the manual:
--enable-cassert

Enables assertion checks in the server, which test for many
can't happen conditions. This is invaluable for code
development purposes, but the tests slow things down a
little. Also, having the tests turned on won't necessarily
enhance the stability of your server! The assertion checks are
not categorized for severity, and so what might be a relatively
harmless bug will still lead to server restarts if it triggers
an assertion failure.  Currently, this option is not
recommended for production use, but you should have it on for
development work or when running a beta version.
Obviously this does not apply to cases where the assert is testing
for something that will cause a core dump anyway, like an improperly
NULL pointer.  But there are many, many asserts for things that are
probably not serious bugs (at worst they might deserve a FATAL exit,
rather than a system-wide PANIC).
Peter E. has speculated about improving the Assert facility to allow
categorization along this line, but I dunno when it will happen.
As far as your original question goes, I find that
MEMORY_CONTEXT_CHECKING and CLOBBER_FREED_MEMORY are quite expensive,
and presently --enable-cassert turns these on.  But of course we could
decouple that if we were going to encourage people to run with asserts
enabled in production.  I don't think asserts are hugely expensive
otherwise (though that might change if we sprinkle them as liberally
as Gaetano's proposal implies...)
			regards, tom lane




Re: [HACKERS] Maximum table size

2003-09-09 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Is our maximum table size limited by the maximum block number?
 
 Certainly.
 
  Is the 16TB number a hold-over from when we weren't sure block number
  was unsigned, though now we are pretty sure it is handled as unsigned
  consistently?
 
 It's a holdover.  As to how certain we are that all the
 signed-vs-unsigned bugs are fixed, who have you heard from running a
 greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
 or even VACUUM FULL on it?  AFAIK we have zero field experience to
 justify promising that it works.
 
 We can surely fix any such bugs that get reported, but we haven't got
 any infrastructure that would find or prevent 'em.

I guess the big question is what do we report as the maximum table size?
Do we report 32TB and fix any bug that happens over 16TB?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Maximum table size

2003-09-09 Thread Tatsuo Ishii
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   Is our maximum table size limited by the maximum block number?
  
  Certainly.
  
   Is the 16TB number a hold-over from when we weren't sure block number
   was unsigned, though now we are pretty sure it is handled as unsigned
   consistently?
  
  It's a holdover.  As to how certain we are that all the
  signed-vs-unsigned bugs are fixed, who have you heard from running a
  greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
  or even VACUUM FULL on it?  AFAIK we have zero field experience to
  justify promising that it works.
  
  We can surely fix any such bugs that get reported, but we haven't got
  any infrastructure that would find or prevent 'em.
 
 I guess the big question is what do we report as the maximum table size?
 Do we report 32TB and fix any bug that happens over 16TB?

That seems the right direction to me. I see no reason why 16TB is a more
reliable number than 32TB, since nobody has ever tried to build 16TB
tables.
--
Tatsuo Ishii



Re: [HACKERS] row level lock and table level locks

2003-09-09 Thread Jenny -
Well, then if I have a transaction1 that does the following:
begin work;
select * from students where age=19 for update;
and then another transaction2 comes along, tries to lock the same row, and
is made to wait.
Does transaction2 find out that the row it's trying to lock is already locked
only after it builds its own TupleTable and has access to the t_infomask (set
to HEAP_MARKED_FOR_UPDATE for this tuple) in the HeapTupleHeader for the
HeapTuple in question, since HeapTuples are stored in TupleTables?
thanks

From: Tom Lane [EMAIL PROTECTED]
To: Jenny - [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Re: [HACKERS] row level lock and table level locks Date: Mon, 08 
Sep 2003 22:33:35 -0400

Jenny - [EMAIL PROTECTED] writes:
 TupleTables are just temporary data structures to hold transiently
 created tuples during execution of a query.  There's usually one for
 each plan node.
 The TupleTable will exist for the query from the point the query is made
 until the transaction is committed? Or does the TupleTable go away as soon
 as the query is finished executing?

It goes away as soon as the query finishes.

My answer above was mistaken --- plan nodes usually allocate slots in a
single TupleTable created (and destroyed) by execMain.c, rather than
each having their own TupleTable.  But it's still a query-lifetime data
structure.
 I would think the TupleTable for that query is held until the transaction
 is committed, since the lock on the tuple is held until the end of the
 transaction.

You keep looking for nonexistent locks on tuples ...

The only resources represented by a TupleTable entry are memory for
a transient tuple (if we rewrote the system today, we'd forget that
function, since short-term memory contexts can do the job better)
or a buffer pin for a tuple that's sitting in a shared disk buffer.
There is no reason to hold a buffer pin beyond the time that the tuple
might actually be referenced by the query plan.
			regards, tom lane



Re: [HACKERS] Maximum table size

2003-09-09 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I guess the big question is what do we report as the maximum table size?
 Do we report 32TB and fix any bug that happens over 16TB?

[shrug] I'm happy with what the docs say now.  I'd rather underpromise
than overpromise.

regards, tom lane



Re: [HACKERS] [PATCHES] mcxt.c

2003-09-09 Thread Gaetano Mendola
Andrew Dunstan [EMAIL PROTECTED] wrote:
 The particular assertion that was proposed doesn't strike me as terribly 
 useful - It should be checked at the point of call rather than inside 
 pstrdup, I should have thought.

Are you going to trust the client of that function?
Here the question is not whether to insert a check/assert there, but to write
a general rule for whether, and where, to insert checks/asserts.

Regards
Gaetano Mendola



Re: [HACKERS] Maximum table size

2003-09-09 Thread Alvaro Herrera
On Tue, Sep 09, 2003 at 02:04:43AM -0400, Tom Lane wrote:

 It's a holdover.  As to how certain we are that all the
 signed-vs-unsigned bugs are fixed, who have you heard from running a
 greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
 or even VACUUM FULL on it?  AFAIK we have zero field experience to
 justify promising that it works.

BTW, I applied CLUSTER to a 1.6 GB tables a couple of days ago for the
first time and man did it take a long time.  The current code is
way too inefficient for rebuilding the table.  Maybe another approach
should be used.  I don't think clustering a 16 TB table is a serious
proposition.

-- 
Alvaro Herrera ([EMAIL PROTECTED])
Si no sabes adonde vas, es muy probable que acabes en otra parte.



[HACKERS] libpq++

2003-09-09 Thread luke
Does anyone know if there already exists a libpq++
written in VC++ (6.x or above)?

I'm trying libpq retrieved from CVS, but with VC++
7.1 (VS 2003) it doesn't work very well.

Thanks

Regards

Luke


Re: [HACKERS] [PATCHES] mcxt.c

2003-09-09 Thread Alvaro Herrera Munoz
On Tue, Sep 09, 2003 at 04:53:06PM +0200, Gaetano Mendola wrote:
 Andrew Dunstan [EMAIL PROTECTED] wrote:
  The particular assertion that was proposed doesn't strike me as terribly 
  useful - It should be checked at the point of call rather than inside 
  pstrdup, I should have thought.
 
 Are you going to trust the client of that function ? 

Yes, because it can only be used in backend code and C functions, which
can be written only by a trusted user ('cause C is an untrusted language).

(I might be wrong...)

-- 
Alvaro Herrera ([EMAIL PROTECTED])
Use it up, wear it out, make it do, or do without



Re: [HACKERS] [PATCHES] mcxt.c

2003-09-09 Thread Andrew Dunstan
Alvaro Herrera Munoz wrote:

On Tue, Sep 09, 2003 at 04:53:06PM +0200, Gaetano Mendola wrote:
 Andrew Dunstan [EMAIL PROTECTED] wrote:
  The particular assertion that was proposed doesn't strike me as terribly
  useful - It should be checked at the point of call rather than inside
  pstrdup, I should have thought.

 Are you going to trust the client of that function ?

Yes, because it can only be used in backend code and C functions, which
can be written only by a trusted user ('cause C is an untrusted language).
(I might be wrong...)

Besides that, trust isn't the issue, but rather what useful information 
can be gathered. How useful is it to know someone called pstrdup() with 
a null pointer? Not very, IMNSHO.

cheers

andrew



Re: [HACKERS] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Andrew Dunstan
Tom Lane wrote:

4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
will be accepted in exactly the same cases where they'd be accepted
in a boolean-requiring SQL construct (such as CASE).  (By default,
none are, so this isn't really different from #2.  But people could
create casts to boolean to override this behavior in a controlled
fashion.)

At this point I'm kinda leaning to #4, because (for example) people
could create a cast from integer to boolean to avoid having to fix their
plpgsql functions right away.  #3 would not offer any configurability of
behavior.

Won't people have to analyse their functions to find out what sort of 
casts they need to create? If so, why don't they just fix the functions 
while they are about it? Surely the fixes in most cases will be quite 
trivial, and in all cases backwards compatible.

Does anyone have a take on how many people would be affected? Or how 
much they would be affected?

cheers

andrew





Re: [HACKERS] Stats Collector Error 7.4beta1 and 7.4beta2

2003-09-09 Thread Kurt Roeckx
On Tue, Sep 09, 2003 at 02:10:20AM -0700, Kevin Brown wrote:
  I could go for Jan's idea of putting a random key into the messages,
  if anyone feels that we should not trust to the kernel to enforce the
  packet source address restriction.  But the memcmp() test seems a clear
  loser given today's discussions.
 
 The test in the 7.3.x code looked reasonable to me, especially if it's
 possible to make it work with IPV6 (if it doesn't already).  It's doing
 basically the right thing, at any rate: directly comparing the actual
 fields that are relevant.  Does this test represent a significant
 performance hit?

The reason I used a memcmp() instead of dealing with the
structure members themselves is because it was easier.

Checking that they're the same address family is easy, and if
they're different the kernel is really broken.

For the addresses and port, in case of IPv4, you have to cast it
to sockaddr_in *, and compare the sin_addr and sin_port like
before.
For IPv6 you could do it with a memcmp on the sin6_addr part, and
put it inside an #ifdef HAVE_IPV6.

If you want to write code to compare 2 addresses, please make it
a general function and place it in ip.c.

Anyway, I'm happy with the current use of recv().


Kurt




Re: [HACKERS] Maximum table size

2003-09-09 Thread Bruce Momjian
Tatsuo Ishii wrote:
  Tom Lane wrote:
   Bruce Momjian [EMAIL PROTECTED] writes:
Is our maximum table size limited by the maximum block number?
   
   Certainly.
   
Is the 16TB number a hold-over from when we weren't sure block number
was unsigned, though now we are pretty sure it is handled as unsigned
consistently?
   
   It's a holdover.  As to how certain we are that all the
   signed-vs-unsigned bugs are fixed, who have you heard from running a
   greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
   or even VACUUM FULL on it?  AFAIK we have zero field experience to
   justify promising that it works.
   
   We can surely fix any such bugs that get reported, but we haven't got
   any infrastructure that would find or prevent 'em.
  
  I guess the big question is what do we report as the maximum table size?
  Do we report 32TB and fix any bug that happens over 16TB?
 
 That seems the right direction to me. I see no reason why 16TB is a more
 reliable number than 32TB, since nobody has ever tried to build 16TB
 tables.

Agreed. I think the question is how large the design supports,
rather than how large we have tested.  (In fact, the check for using
block numbers as unsigned was removed from the FAQ when I reviewed the
code.)

I know Tom is concerned because we haven't tested it, but I don't think
anyone has tested 16TB either, nor our 1600-column limit.

Also, I think people look at these numbers to determine if PostgreSQL
can handle their data needs 5-10 years down the road.

In fact, if you increase the page size, you can quadruple most of the
existing limits.  This is already mentioned in the FAQ:

   The maximum table size and maximum number of columns can
   be increased if the default block size is increased to 32k.

I have updated the FAQ to say 32TB, and of course, larger page sizes
could make this 128TB.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] libpq++

2003-09-09 Thread Jeroen T. Vermeulen
On Tue, Sep 09, 2003 at 01:57:47PM -0400, Bruce Momjian wrote:
 
 Sure libpq++ and libpqpp are on http://gborg.postgresql.org.

Ahem.




Re: [HACKERS] libpq++

2003-09-09 Thread Bruce Momjian
Jeroen T. Vermeulen wrote:
 On Tue, Sep 09, 2003 at 01:57:47PM -0400, Bruce Momjian wrote:
  
  Sure libpq++ and libpqpp are on http://gborg.postgresql.org.

Uh, sorry, libpqpp and libpqxx.  libpqxx is the newer one.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Maximum table size

2003-09-09 Thread Gaetano Mendola
On Tue, 9 Sep 2003 14:25:19 -0400 (EDT), [EMAIL PROTECTED] (Bruce
Momjian) wrote:

Tatsuo Ishii wrote:
  Tom Lane wrote:
   Bruce Momjian [EMAIL PROTECTED] writes:
Is our maximum table size limited by the maximum block number?
   
   Certainly.
   
Is the 16TB number a hold-over from when we weren't sure block number
was unsigned, though now we are pretty sure it is handled as unsigned
consistently?
   
   It's a holdover.  As to how certain we are that all the
   signed-vs-unsigned bugs are fixed, who have you heard from running a
   greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
   or even VACUUM FULL on it?  AFAIK we have zero field experience to
   justify promising that it works.
   
   We can surely fix any such bugs that get reported, but we haven't got
   any infrastructure that would find or prevent 'em.
  
  I guess the big question is what do we report as the maximum table size?
  Do we report 32TB and fix any bug that happens over 16TB?
 
 That seems the right direction to me. I see no reason why 16TB is a more
 reliable number than 32TB, since nobody has ever tried to build 16TB
 tables.

Agreed. I think the question is how large does the design support,
rather than how large have we tested.  (In fact, the check for using
block numbers as unsigned was removed from the FAQ when I reviewed the
code.)

I know Tom is concerned because we haven't tested it, but I don't think
anyone has tested 16TB either, nor our 1600-column limit.

Well, making some tests with 1600 columns should not be so difficult, and
I would not bet that nobody has reached this limit.


Also, I think people look at these numbers to determine if PostgreSQL
can handle their data needs 5-10 years down the road.

I don't agree that people are looking at PostgreSQL to handle their needs
5-10 years out; what I think (it is anyway my opinion) is that people are
looking at Postgres in order to avoid more expensive tools like
ORACLE, SYBASE, INFORMIX, and to have a low TCO.


In fact, if you increase the page size, you can quadruple most of the
existing limits.  This is already mentioned in the FAQ:

   The maximum table size and maximum number of columns can
   be increased if the default block size is increased to 32k.

I have updated the FAQ to say 32TB, and of course, larger page sizes
could make this 128TB.


Why this? Just because bigger is better? I agree with Tom Lane: it is
better to underpromise than overpromise.


Regards
Gaetano Mendola




Re: [HACKERS] Maximum table size

2003-09-09 Thread Jeroen Ruigrok/asmodai
-On [20030909 20:32], Bruce Momjian ([EMAIL PROTECTED]) wrote:
I know Tom is concerned because we haven't tested it, but I don't think
anyone has tested 16TB either, nor our 1600-column limit.

If I had the space free on my SAN right now I'd try it.

The 1600 column limit should be easy to test on every system with some
scripts, no?

Also, I think people look at these numbers to determine if PostgreSQL
can handle their data needs 5-10 years down the road.

At work right now I have a bunch of 2-3 TB databases using Oracle 8.
We're expected to be using 60 TB in total storage about 2 years down the
road (right now we're using about 20).

I guess GIS databases and image databases might be the ones who would be
more concerned about these sort of limits in the near term future?

-- 
Jeroen Ruigrok van der Werven asmodai(at)wxs.nl / asmodai
PGP fingerprint: 2D92 980E 45FE 2C28 9DB7  9D88 97E6 839B 2EAC 625B
http://www.tendra.org/   | http://www.in-nomine.org/~asmodai/diary/
From morning to night I stayed out of sight / Didn't recognise I'd become
No more than alive I'd barely survive / In a word, overrun...



Re: [HACKERS] pgsql in shared lib

2003-09-09 Thread ivan


It's true, but sometimes programmers need a good database engine for a simple
program. I think that Postgres is one of the best free, open-source SQL
databases, but it's too much to install a server for only one application,
on one workstation. So I thought that there could be a way out: to build a
most simple version of Postgres as a lib (or libs); it could also be done
like a porting effort?


 On Mon, 8 Sep 2003, Tom Lane wrote:

 Doug McNaught [EMAIL PROTECTED] writes:
  ivan [EMAIL PROTECTED] writes:
  is it possible to compile postgres (after some small modifications) to a shared
  .so, or dll, and use it like normal postgres, but without any server and
  so on.

  Not without very major code changes.

 ... which are unlikely to happen, given the development community's
 strong emphasis on reliability.  An embedded database is inherently less
 reliable than a client/server one, since any application bug has the
 potential to corrupt the database.  With client/server, at least we only
 have to worry about our own bugs ;-)

   regards, tom lane





Re: [HACKERS] Maximum table size

2003-09-09 Thread Paulo Scardine
From: Gaetano Mendola [EMAIL PROTECTED]
 Why this ? just because bigger is better? I agree with Tom Lane, is
 better underpromise than overpromise.

My $0.02:
You are talking about PG's theoretical limits.
Why not add to the docs some information about the lack of resources
for testing these limits, and ask for donations?
Some kind of core-developers-hardware-wish-list???
If someone is going to handle this amount of data with Postgres, it seems
they can contribute some hardware for the dev team.
Here in Brazil we have a saying: who doesn't cry, doesn't get milk.

Regards,
--
Paulo Scardine




Re: [HACKERS] Maximum table size

2003-09-09 Thread Dann Corbit


 -Original Message-
 From: Jeroen Ruigrok/asmodai [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, September 09, 2003 1:23 PM
 To: Bruce Momjian
 Cc: Tatsuo Ishii; [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Maximum table size
 
 
 -On [20030909 20:32], Bruce Momjian ([EMAIL PROTECTED]) wrote:
 I know Tom is concerned because we haven't tested it, but I 
 don't think 
 anyone has tested 16TB either, nor our 1600-column limit.
 
 If I had the space free on my SAN right now I'd try it.
 
 The 1600 column limit should be easy to test on every system 
 with some scripts, no?
/*
** This will generate a 28 megabyte SQL script.
** 1600 table definitions will be created for tables
** with from 1 to 1600 columns.
*/
#include <stdio.h>

#define LIMIT 1600

int main(void)
{
    int i,
        j;

    for (i = 1; i <= LIMIT; i++) {
        printf("CREATE TABLE FOO%04d (\n", i);
        for (j = 1; j < i; j++) {
            printf("\tA%04d varchar(256),\n", j);
        }
        printf("\tA%04d varchar(256)\n", j);
        puts(");\n");
    }
    return 0;
}
/*
Last table is:
CREATE TABLE FOO1600 (
A0001 varchar(256),
A0002 varchar(256),
A0003 varchar(256),
A0004 varchar(256),
A0005 varchar(256),
A0006 varchar(256),
A0007 varchar(256),
A0008 varchar(256),
A0009 varchar(256),
A0010 varchar(256),
A0011 varchar(256),
A0012 varchar(256),
A0013 varchar(256),
A0014 varchar(256),
A0015 varchar(256),
A0016 varchar(256),
A0017 varchar(256),
A0018 varchar(256),
A0019 varchar(256),
A0020 varchar(256),
A0021 varchar(256),
A0022 varchar(256),
A0023 varchar(256),
A0024 varchar(256),
A0025 varchar(256),
A0026 varchar(256),
A0027 varchar(256),
A0028 varchar(256),
A0029 varchar(256),
A0030 varchar(256),
A0031 varchar(256),
A0032 varchar(256),
A0033 varchar(256),
A0034 varchar(256),
A0035 varchar(256),
A0036 varchar(256),
A0037 varchar(256),
A0038 varchar(256),
A0039 varchar(256),
A0040 varchar(256),
A0041 varchar(256),
A0042 varchar(256),
A0043 varchar(256),
A0044 varchar(256),
A0045 varchar(256),
A0046 varchar(256),
A0047 varchar(256),
A0048 varchar(256),
A0049 varchar(256),
A0050 varchar(256),
A0051 varchar(256),
A0052 varchar(256),
A0053 varchar(256),
A0054 varchar(256),
A0055 varchar(256),
A0056 varchar(256),
A0057 varchar(256),
A0058 varchar(256),
A0059 varchar(256),
A0060 varchar(256),
A0061 varchar(256),
A0062 varchar(256),
A0063 varchar(256),
A0064 varchar(256),
A0065 varchar(256),
A0066 varchar(256),
A0067 varchar(256),
A0068 varchar(256),
A0069 varchar(256),
A0070 varchar(256),
A0071 varchar(256),
A0072 varchar(256),
A0073 varchar(256),
A0074 varchar(256),
A0075 varchar(256),
A0076 varchar(256),
A0077 varchar(256),
A0078 varchar(256),
A0079 varchar(256),
A0080 varchar(256),
A0081 varchar(256),
A0082 varchar(256),
A0083 varchar(256),
A0084 varchar(256),
A0085 varchar(256),
A0086 varchar(256),
A0087 varchar(256),
A0088 varchar(256),
A0089 varchar(256),
A0090 varchar(256),
A0091 varchar(256),
A0092 varchar(256),
A0093 varchar(256),
A0094 varchar(256),
A0095 varchar(256),
A0096 varchar(256),
A0097 varchar(256),
A0098 varchar(256),
A0099 varchar(256),
A0100 varchar(256),
A0101 varchar(256),
A0102 varchar(256),
A0103 varchar(256),
A0104 varchar(256),
A0105 varchar(256),
A0106 varchar(256),
A0107 varchar(256),
A0108 varchar(256),
A0109 varchar(256),
A0110 varchar(256),
A0111 varchar(256),
A0112 varchar(256),
A0113 varchar(256),
A0114 varchar(256),
A0115 varchar(256),
A0116 varchar(256),
A0117 varchar(256),
A0118 varchar(256),
A0119 varchar(256),
A0120 varchar(256),
A0121 varchar(256),
A0122 varchar(256),
A0123 varchar(256),
A0124 varchar(256),
A0125 varchar(256),
A0126 varchar(256),
A0127 varchar(256),
A0128 varchar(256),
A0129 varchar(256),
A0130 varchar(256),
A0131 varchar(256),
A0132 varchar(256),
A0133 varchar(256),
A0134 varchar(256),
A0135 varchar(256),
A0136 varchar(256),
A0137 varchar(256),
A0138

Re: [HACKERS] Maximum table size

2003-09-09 Thread Andrew Dunstan
Jeroen Ruigrok/asmodai wrote:

At work right now I have a bunch of 2-3 TB databases using Oracle 8.
We're expected to be using 60 TB in total storage about 2 years down the
road (right now we're using about 20).
I guess GIS databases and image databases might be the ones who would be
more concerned about these sort of limits in the near term future?
 

They must be very big images or there must be an awful lot of them :-)

Here's a recent sizing done on our image database project:

60,000 images, thumbnails and personal data are occupying about 1.4Gb 
of disk space in the database. So we can figure roughly 25Mb per 1000 
entries, or 25Gb per 1 million.

These are simple facial images, of about 7k each - thumbnails are about 
3k each.

cheers

andrew



Re: [HACKERS] Maximum table size

2003-09-09 Thread Jeroen Ruigrok/asmodai
-On [20030909 23:02], Andrew Dunstan ([EMAIL PROTECTED]) wrote:
They must be very big images or there must be an awful lot of them :-)

*grin*

I was more thinking of organizations such as NASA and commercial
entities storing satellite images in databases.

-- 
Jeroen Ruigrok van der Werven asmodai(at)wxs.nl / asmodai
PGP fingerprint: 2D92 980E 45FE 2C28 9DB7  9D88 97E6 839B 2EAC 625B
http://www.tendra.org/   | http://www.in-nomine.org/~asmodai/diary/
I dream of gardens in the desert sand...

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] 2-phase commit

2003-09-09 Thread Bruce Momjian

I haven't seen any comment on this email.

From our previous discussion of 2-phase commit, there was concern that
the failure modes of 2-phase commit were not solvable.  However, I think
multi-master replication is going to have similar non-solvable failure
modes, yet people still want multi-master replication.

We have had several requests for 2-phase commit in the past month.  I
think we should encourage the Japanese group to continue on their
2-phase commit patch to be included in 7.5.  Yes, it will have
non-solvable failure modes, but let's discuss them and find an
appropriate way to deal with the failures.

---

Andrew Sullivan wrote:
 Hi,
 
 As the 7.4 beta rolls on, I thought now would be a good time to start
 talking about the future.  
 
 I have a potential need in the future for distributed transactions
 (XA).  To get that from Postgres, I'd need two-phase commit, I think. 
 There is someone working on such a project
 (http://snaga.org/pgsql/), but last time it was discussed here, it
 received a rather lukewarm reception (see, e.g., the thread starting
 at
 http://archives.postgresql.org/pgsql-hackers/2003-06/msg00752.php).
 
 While at OSCON, I had a discussion with Joe Conway, Bruce Momjian,
 and Greg Sabino Mullane about 2PC.  Various people expressed various
 opinions on the topic, but I think we agreed on the following.  The
 relevant folks can correct me if I'm wrong:
 
 Two-phase commit has theoretical problems, but it is implemented in
 several enterprise RDBMS.  2PC is something needed by certain kinds
 of clients (especially those with transaction managers), so if
 PostgreSQL doesn't have it, PostgreSQL just won't get supported in
 that arena.  Someone is already working on 2PC, but may feel unwanted
 due to the reactions last heard on the topic, and may not continue
 working unless he gets some support.  What is a necessary condition
 for such support is to get some idea of what compromises 2PC might
 impose, and thereafter to try to determine which such compromises, if
 any, are acceptable ones.
 
 I think the idea here is that, while in most cases a pretty-good
 implementation of a desirable feature might get included in the
 source on the grounds that it can always be improved upon later,
 something like 2PC has the potential to do great harm to an otherwise
 reliable transaction manager.  So the arguments about what to do need
 to be aired in advance. 
 
 I (perhaps foolishly) volunteered to undertake to collect the
 arguments in various directions, on the grounds that I can contribute
 no code, but have skin made of asbestos.  I thought I'd try to
 collect some information about what people think the problems and
 potentially acceptable compromises are, to see if there is some way
 to understand what can and cannot be contemplated for 2PC.  I'll
 include in any such outline the remarks found in the -hackers thread
 referenced above.  Any objections?
 
 A
 
 -- 
 
 Andrew Sullivan 204-4141 Yonge Street
 Liberty RMS   Toronto, Ontario Canada
 [EMAIL PROTECTED]  M2P 2A8
  +1 416 646 3304 x110
 
 
 ---(end of broadcast)---
 TIP 6: Have you searched our list archives?
 
http://archives.postgresql.org
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] massive quotes?

2003-09-09 Thread Bruce Momjian

I assume we never came to a final conclusion on how to do CREATE
FUNCTION without double-quoting.

---

Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  In that case, though, the solution will presumably look at least a bit 
  different from those discussed so far in this thread. Or would you have 
  psql detect that in place of a string there was stdin or whatever and 
  then replace it with the inline string before passing it to the backend?
 
 Please see the archives.  I think that what was being discussed was
 something along the lines of
 
   foo= CREATE FUNCTION myfunc(...) RETURNS ... AS
   foo= \beginliteral
   foo' type my function definition here
   foo' and here
   foo' \endliteral
   foo- LANGUAGE plpgsql;
 
 and psql would proceed to quotify whatever you entered between
 the two backslash commands.  (Notice this could be used for any
 string-literal entry problem, not only CREATE FUNCTION.)  I'm fuzzy on
 the details though; this may not have been the best idea presented.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match
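The "quotify" step Tom describes in the exchange above amounts to wrapping the entered text in single quotes and doubling any embedded quote or backslash, per the string-literal escaping rules of the era. A minimal sketch, assuming a simple escaping pass (quotify() is an illustrative name, not actual psql code):

```c
/* Hypothetical sketch of psql-side "quotification": wrap raw text in
 * single quotes, doubling embedded quotes and backslashes so the result
 * is a valid SQL string literal.  quotify() is an illustrative name,
 * not an actual psql function. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

char *quotify(const char *raw)
{
    /* worst case: every character doubled, plus two quotes and a NUL */
    char *out = malloc(strlen(raw) * 2 + 3);
    char *p = out;

    if (out == NULL)
        return NULL;
    *p++ = '\'';
    for (; *raw; raw++)
    {
        if (*raw == '\'' || *raw == '\\')
            *p++ = *raw;    /* double the character that needs escaping */
        *p++ = *raw;
    }
    *p++ = '\'';
    *p = '\0';
    return out;
}
```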


Re: [HACKERS] 2-phase commit

2003-09-09 Thread Mike Mascari
Bruce Momjian wrote:
 I haven't seen any comment on this email.
 
 From our previous discussion of 2-phase commit, there was concern that
 the failure modes of 2-phase commit were not solvable.  However, I think
 multi-master replication is going to have similar non-solvable failure
 modes, yet people still want multi-master replication.
 
 We have had several requests for 2-phase commit in the past month.  I
 think we should encourage the Japanese group to continue on their
 2-phase commit patch to be included in 7.5.  Yes, it will have
 non-solvable failure modes, but let's discuss them and find an
 appropriate way to deal with the failures.

FWIW, Oracle 8's manual for the recovery of a distributed tx where the
coordinator never comes back on line is:

https://www.ifi.uni-klu.ac.at/Public/Documentation/oracle/product/8.0.3/doc/server803/A54643_01/ch_intro.htm#7783

If a database must be recovered to a point in the past, Oracle's
recovery facilities allow database administrators at other sites to
return their databases to the earlier point in time also. This ensures
that the global database remains consistent.

So it seems, for Oracle 8 at least, PITR is the method of recovery for
cohorts after unrecoverable coordinator failure.

Ugly and yet probably a prerequisite.

Mike Mascari
[EMAIL PROTECTED]








---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Philip Yarra
On Thu, 4 Sep 2003 05:36 am, Bruce Momjian wrote:
 I would like every operating system that supports thread-safety to run
 this program and report back the results.

Okay, here's results from the machines I have access to... I think what you're 
going to find is that an awful lot of platforms that do support pthreads do 
not necessarily provide thread-safe libc functions. 

Is it possible to identify which functions are likely to be called by multiple 
threads and create our own mutex-wrapper functions for them? Disabling 
thread-safety seems pretty drastic simply because of a lack of getpwuid_r() 
or thread-safe getpwuid(). Or do I misunderstand your concerns?

Regards, Philip.

$ uname -a
OSF1 hostname V4.0 1229 alpha
$ ./a.out 
Your getpwuid() changes the static memory area between calls
Your strerror() is _not_ thread-safe
Your functions are _not_ all thread-safe

There are older _r functions, but they're deprecated as the non-_r
versions are now thread-safe.

$ uname -a
SunOS hostname 5.6 Generic_105181-05 sun4m sparc SUNW,SPARCstation-4
$ gcc -lpthread -lnsl test.c # this works
$ ./a.out
Your gethostbyname() is _not_ thread-safe
Your getpwuid() is _not_ thread-safe
Your functions are _not_ all thread-safe

getpwuid_r provided
gethostbyname_r not provided

FreeBSD 5.1 (i386)
$ cc -pthread test.c
$ ./a.out
Your gethostbyname() is _not_ thread-safe
Your getpwuid() is _not_ thread-safe
Your functions are _not_ all thread-safe

manpage notes BUGS
 These functions use static data storage; if the data is needed for future
 use, it should be copied before any subsequent calls overwrite it.

FreeBSD 4.8 (i386)
$ cc -pthread test.c
$ ./a.out
Your gethostbyname() is _not_ thread-safe
Your getpwuid() is _not_ thread-safe
Your functions are _not_ all thread-safe

manpage notes BUGS
 These functions use static data storage; if the data is needed for future
 use, it should be copied before any subsequent calls overwrite it.

Linux 2.4.18-3 (i686)
$ ./a.out
Your gethostbyname() is _not_ thread-safe
Your getpwuid() is _not_ thread-safe
Your functions are _not_ all thread-safe

manpage notes: The functions gethostbyname() and gethostbyaddr() may return
pointers to static data, which may be overwritten by later calls. Copying
the struct hostent does not suffice, since it contains pointers - a deep
copy is required.

Glibc2 also has reentrant versions gethostbyname_r() and gethostbyname2_r().
These return 0 on success and nonzero on error. The result of the call is
now stored in the struct with address ret. After the call, *result will be
NULL on error or point to the result on success. Auxiliary data is stored
in the buffer buf of length buflen. (If the buffer is too small, these
functions will return ERANGE.) No global variable h_errno is modified, but
the address of a variable in which to store error numbers is passed in
h_errnop.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
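For reference, the reentrant getpwuid_r() discussed in the message above fills a caller-supplied struct and scratch buffer instead of returning static storage, so no locking is needed where it exists. A minimal sketch under the POSIX signature (get_user_name() is an illustrative helper, not PostgreSQL code):

```c
/* Sketch of the reentrant getpwuid_r(): the caller supplies the passwd
 * struct and a scratch buffer for its strings, so no static storage is
 * involved and no mutex is needed.  Illustrative helper only. */
#include <assert.h>
#include <pwd.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

int get_user_name(uid_t uid, char *name, size_t namelen)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    char buf[1024];     /* scratch space for the strings pwd points at */

    if (getpwuid_r(uid, &pwd, buf, sizeof buf, &result) != 0 || result == NULL)
        return -1;      /* lookup failed or no such uid */
    strncpy(name, result->pw_name, namelen - 1);
    name[namelen - 1] = '\0';
    return 0;
}
```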


Re: [HACKERS] massive quotes?

2003-09-09 Thread Andrew Dunstan
I think if it could be done in a reasonably aesthetic way in psql that 
would satisfy many people, without any need to disturb the backend, 
which Tom objects to.

That's a big if,  IMNSHO :-).

I'd hate to see this dropped, though

cheers

andrew

Bruce Momjian wrote:

I assume we never came to a final conclusion on how to do CREATE
FUNCTION without double-quoting.
---

Tom Lane wrote:
 

Andrew Dunstan [EMAIL PROTECTED] writes:
   

In that case, though, the solution will presumably look at least a bit 
different from those discussed so far in this thread. Or would you have 
psql detect that in place of a string there was stdin or whatever and 
then replace it with the inline string before passing it to the backend?
 

Please see the archives.  I think that what was being discussed was
something along the lines of
foo= CREATE FUNCTION myfunc(...) RETURNS ... AS
foo= \beginliteral
foo' type my function definition here
foo' and here
foo' \endliteral
foo- LANGUAGE plpgsql;
and psql would proceed to quotify whatever you entered between
the two backslash commands.  (Notice this could be used for any
string-literal entry problem, not only CREATE FUNCTION.)  I'm fuzzy on
the details though; this may not have been the best idea presented.
			regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend
   

 



---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] 2-phase commit

2003-09-09 Thread Bruce Momjian
Mike Mascari wrote:
 Bruce Momjian wrote:
  I haven't seen any comment on this email.
  
  From our previous discussion of 2-phase commit, there was concern that
  the failure modes of 2-phase commit were not solvable.  However, I think
  multi-master replication is going to have similar non-solvable failure
  modes, yet people still want multi-master replication.
  
  We have had several requests for 2-phase commit in the past month.  I
  think we should encourage the Japanese group to continue on their
  2-phase commit patch to be included in 7.5.  Yes, it will have
  non-solvable failure modes, but let's discuss them and find an
  appropriate way to deal with the failures.
 
 FWIW, Oracle 8's manual for the recovery of a distributed tx where the
 coordinator never comes back on line is:
 
 https://www.ifi.uni-klu.ac.at/Public/Documentation/oracle/product/8.0.3/doc/server803/A54643_01/ch_intro.htm#7783
 
 If a database must be recovered to a point in the past, Oracle's
 recovery facilities allow database administrators at other sites to
 return their databases to the earlier point in time also. This ensures
 that the global database remains consistent.
 
 So it seems, for Oracle 8 at least, PITR is the method of recovery for
 cohorts after unrecoverable coordinator failure.

Yep, I assume PITR would be the solution for most failure cases --- very
ugly of course.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[HACKERS] Can I use PostgreSQL to develop a self-organizing database system?

2003-09-09 Thread Rada Chirkova
Hi,

I have asked my question on pgsql-general, and Tom Lane suggested I post
here too. I would really appreciate your opinion.

At NC State University, my students and I are working on a project called
self-organizing databases, please see description below. I would like to
use an open-source database system for implementation and would really
appreciate your opinion on whether PostgreSQL is suitable for the project.
In general, I am very impressed by the quality of PostgreSQL code and
documentation, as well as by the support of the developer community.

For the project, I need a cost-based query optimizer with exhaustive join
enumeration and use of statistics on stored relations; PostgreSQL has that.
I also need the ability to process SQL queries with aggregation, extensive
indexing capabilities, view mechanisms, and possibly integrity constraints;
it seems that PostgreSQL has all that. We will modify the query optimizer to
incorporate rewriting queries using views, and we will create
view-generating and view-manipulating modules.

Please let me know if you have comments.

Sincerely,

Rada Chirkova

==
Self-Organizing Databases

The goal of this project is to develop new effective methods to improve the
performance of sets of frequent and important queries on large relational
databases at all times, which could improve the efficiency of user
interactions with data-management systems. Solving the problem will have the
most effect in query optimization, data warehousing, and information
integration, which are important research topics with direct practical
applications.

The project focuses on the methodology of evaluating queries using views;
views are relations that are defined by auxiliary queries and can be used to
rewrite and answer user queries. One way to improve query performance is
to precompute and store (i.e., materialize) views.

To truly optimize query performance, it is critical to materialize the
right views. The current focus of the project is on demonstrating that, by
designing and materializing views, it is possible to ensure optimal or
near-optimal performance of frequent and important queries, for common and
important query types. We consider this problem in the broader context of
designing self-organizing databases: A self-organizing database periodically
determines, without human intervention, a representative set of frequent and
important queries on the data, and incrementally designs and precomputes the
optimal (or near-optimal) views for that representative query workload. As
the representative query workload and the stored data change over time,
self-organizing databases adapt to the changes by changing the set of
materialized views that are used to improve the query-answering performance
in the database.

For building self-organizing databases, we consider an end-to-end solution –
that is, we consider all aspects of handling and using views, including:
·   designing and materializing views and indexes to improve query
performance;
·   exploring the effects of materialized views on the process of query
optimization;
·   adapting view design to the changing query workload, including the process
of retiring views that are no longer useful;
·   developing methods for automatically updating existing materialized views
over time, to reflect the changes in the stored data;
·   developing methods to collect database statistics to reliably estimate the
sizes of the views the system considers for materialization;
·   analyzing the use of system resources and allocating an appropriate amount
of resources to view management in the system.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[HACKERS] Temp tables and copy

2003-09-09 Thread Konstantin Goudkov

Hey guys, not sure if this is the right place to post.
Not even sure if this is a bug or a feature :)

When I create a temp table and then try to copy some data into the
table, if the data is corrupt and the synchronization is lost - the
table also seems to get lost.

For example
create temp table words(aa varchar(254));
copy words from stdin;
some stuff
more stuff
corrupt line\tline\tline
more\tstuff
and yet more stuff
\.

(replace ''\t'' with the real tabs to break the synchronization)

after that,

select * from words;
ERROR:  Relation "words" does not exist

Running:
 Welcome to psql 7.3.4, the PostgreSQL interactive terminal.
on
 FreeBSD dev1.idftech.com 4.6-RELEASE FreeBSD 4.6-RELEASE #2: Sun Jun  8 04:57:54 EDT 
 2003


Konstantin Goudkov
IDF Technologies, LLC.
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [osdldbt-general] Re: [HACKERS] Prelimiary DBT-2 Test results

2003-09-09 Thread Mary Edie Meredith
On Fri, 2003-09-05 at 15:16, Manfred Spraul wrote:
 Another question:
 Is it possible to apply patches to postgresql before a DBT-2 run, or is 
 only patching the kernel supported?
 
 --
 Manfred

As Mark indicated, we currently only support kernel patches via our PLM
system, but we are in the process of changing that for other components
(e.g. the compilers, the statistics tools). Ultimately, anything we can
save as source code we will be able to patch and make accessible via our
test platform, the Scalable Test Platform (STP).

If we were to make it possible to download PostgreSQL releases on PLM
and allow developers to apply patches to it, would there be interest
from the community in that capability?   

In other words, would you all use it to test compiles on various
hardware (I32, I64, PowerPC), or test PostgreSQL on various Linux
kernels?  

Would you run the patches against the database test we have via STP?

Is there any other feature you might suggest?


-- 
Mary Edie Meredith [EMAIL PROTECTED]
Open Source Development Lab


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Bruce Momjian
Philip Yarra wrote:
 On Thu, 4 Sep 2003 05:36 am, Bruce Momjian wrote:
  I would like every operating system that supports thread-safety to run
  this program and report back the results.
 
 Okay, here's results from the machines I have access to... I think what you're 
 going to find is that an awful lot of platforms that do support pthreads do 
 not necessarily provide thread-safe libc functions. 

I see --- looks bad: failures below for OSF, Solaris, and FreeBSD.

 Is it possible to identify which functions are likely to be called by multiple 
 threads and create our own mutex-wrapper functions for them? Disabling 
 thread-safety seems pretty drastic simply because of a lack of getpwuid_r() 
 or thread-safe getpwuid(). Or do I misunderstand your concerns?

I am starting to think your approach is the only way to go --- I was
thinking of it for FreeBSD, but now, with these additional platforms, it
seems almost a requirement.

We would have to get some thread mutex, make the function call, copy the
return values into the passed pointer, and release the mutex?  Do we
test to see if we are in thread mode before doing the locking?  Is that
test even possible or desirable?

Seems we will need to rename the config variable to be
NON_REENTRANT_FUNC_NAMES_THREADSAFE, and add configure checks for each
*_r function, and fall back to the mutex if both settings are false.

This part has me concerned too:

 Copying the struct hostent does not suffice, since it contains
 pointers  -  a  deep copy is required.

Would someone with thread mutex experience assist me?
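The scheme described above -- take a mutex, call the non-reentrant function, copy the result out, release the mutex -- might look roughly like this for strerror() (a sketch only; pg_strerror is an illustrative name, not actual libpq API, and real code would fall back to the *_r variants via configure as discussed):

```c
/* Sketch of the mutex-wrapper idea under discussion: serialize calls to
 * a non-reentrant libc function and copy its static result into a
 * caller-supplied buffer before releasing the lock.  pg_strerror is an
 * illustrative name, not actual libpq code. */
#include <assert.h>
#include <pthread.h>
#include <string.h>

static pthread_mutex_t strerror_lock = PTHREAD_MUTEX_INITIALIZER;

char *pg_strerror(int errnum, char *buf, size_t buflen)
{
    pthread_mutex_lock(&strerror_lock);
    strncpy(buf, strerror(errnum), buflen - 1);
    buf[buflen - 1] = '\0';     /* strncpy may not NUL-terminate */
    pthread_mutex_unlock(&strerror_lock);
    return buf;
}
```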


---

 
 Regards, Philip.
 
 $ uname -a
 OSF1 hostname V4.0 1229 alpha
 $ ./a.out 
 Your getpwuid() changes the static memory area between calls
 Your strerror() is _not_ thread-safe
 Your functions are _not_ all thread-safe
 
 There are older _r functions, but they're deprecated as the non-_r
 versions are now thread-safe.
 
 $ uname -a
 SunOS hostname 5.6 Generic_105181-05 sun4m sparc SUNW,SPARCstation-4
 $ gcc -lpthread -lnsl test.c # this works
 $ ./a.out
 Your gethostbyname() is _not_ thread-safe
 Your getpwuid() is _not_ thread-safe
 Your functions are _not_ all thread-safe
 
 getpwuid_r provided
 gethostbyname_r not provided
 
 FreeBSD 5.1 (i386)
 $ cc -pthread test.c
 $ ./a.out
 Your gethostbyname() is _not_ thread-safe
 Your getpwuid() is _not_ thread-safe
 Your functions are _not_ all thread-safe
 
 manpage notes BUGS
  These functions use static data storage; if the data is needed for future
  use, it should be copied before any subsequent calls overwrite it.
 
 FreeBSD 4.8 (i386)
 $ cc -pthread test.c
 $ ./a.out
 Your gethostbyname() is _not_ thread-safe
 Your getpwuid() is _not_ thread-safe
 Your functions are _not_ all thread-safe
 
 manpage notes BUGS
  These functions use static data storage; if the data is needed for future
  use, it should be copied before any subsequent calls overwrite it.
 
 Linux 2.4.18-3 (i686)
 $ ./a.out
 Your gethostbyname() is _not_ thread-safe
 Your getpwuid() is _not_ thread-safe
 Your functions are _not_ all thread-safe
 
 manpage notes: The functions gethostbyname() and gethostbyaddr() may return
 pointers to static data, which may be overwritten by later calls. Copying
 the struct hostent does not suffice, since it contains pointers - a deep
 copy is required.
 
 Glibc2 also has reentrant versions gethostbyname_r() and gethostbyname2_r().
 These return 0 on success and nonzero on error. The result of the call is
 now stored in the struct with address ret. After the call, *result will be
 NULL on error or point to the result on success. Auxiliary data is stored
 in the buffer buf of length buflen. (If the buffer is too small, these
 functions will return ERANGE.) No global variable h_errno is modified, but
 the address of a variable in which to store error numbers is passed in
 h_errnop.
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Temp tables and copy

2003-09-09 Thread Gavin Sherry
On Mon, 8 Sep 2003, Konstantin Goudkov wrote:

 
 Hey guys, not sure if this is the right place to post.
 Not even sure if this is a bug or a feature :)
 
 When I create a temp table and then try to copy some data into the
 table, if the data is corrupt and the synchronization is lost - the
 table also seems to get lost.

Temp tables only exist for the current session (connection). If there is a
problem with the copy and the connection gets reset, the session is
therefore ended and the temp table removed. As such, it is a feature.

Gavin


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Can I use PostgreSQL to develop a self-organizing

2003-09-09 Thread Gavin Sherry
On Tue, 9 Sep 2003, Rada Chirkova wrote:

 Hi,
 
 I have asked my question on pgsql-general, and Tom Lane suggested I post
 here too. I would really appreciate your opinion.
 
 At NC State University, my students and I are working on a project called
 self-organizing databases, please see description below. I would like to
 use an open-source database system for implementation and would really
 appreciate your opinion on whether PostgreSQL is suitable for the project.
 In general, I am very impressed by the quality of PostgreSQL code and
 documentation, as well as by the support of the developer community.
 
 For the project, I need a cost-based query optimizer with exhaustive join
 enumeration and use of statistics on stored relations; PostgreSQL has that.
 I also need the ability to process SQL queries with aggregation, extensive
 indexing capabilities, view mechanisms, and possibly integrity constraints;
 it seems that PostgreSQL has all that. We will modify the query optimizer to
 incorporate rewriting queries using views, and we will create
 view-generating and view-manipulating modules.
 
 Please let me know if you have comments.

PostgreSQL does not, as yet, support materialised views. Since your
project aims to create materialised views automagically based on usage
patterns, this doesn't seem like a problem.

Your project will probably need to modify the statistics collector code to
record performance of queries on different relations over time. There is
already code which uses these statistics to 'reorganise' the database, but
not in as sophisticated a way as you are looking at
(contrib/pg_autovacuum).

Be sure to keep the list up to date with what you are doing -- if you
proceed -- as this is an area into which database vendors seem to be moving
(IBM seems to be making the most noise about it).

Thanks,

Gavin


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Temp tables and copy

2003-09-09 Thread Nico King
Note: forwarded message attached.
---BeginMessage---
hi
it seems we are sharing the same problem at different levels.
here is my original problem which I posted a couple of days ago:

Here is the problem: I have some key tables that I need to import some
data into. I can't go ahead and write "insert into table value()" for over
40 different tables and over 100s of rows and columns. The reason that I
have to write a script to enter the data into the tables is: what if I
have to enter 1000 lines of data into 200 rows? Here is a piece of my
script that works, but not when I enter, let's say, a char instead of an
integer:

copy accounts from stdin using delimiters ',';
1,pass,mac,,,
2,pass2,mac2,ip,test
0,pass2,mac2,ip,test2
\.

P.S: I have also tried the tab delimiter.

I have written a script to import some data into my database tables, with
the delimiter ','. Now my question is: sometimes the data being sent to my
tables might not match the data type or might be corrupted, and I receive
an error message. One: how could I prevent that? Two: how can I proceed
with importing the rest of the data into the next record even though some
are corrupted, 'cause I get interrupted as soon as there is an error in
inserting the data?

Konstantin Goudkov [EMAIL PROTECTED] wrote:
 [...]
---End Message---

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Philip Yarra
On Wed, 10 Sep 2003 11:46 am, Bruce Momjian wrote:
 I see --- looks bad: failures below for OSF, Solaris, and FreeBSD.

Actually, I am not sure the OSF failure is correctly reported... your test app 
had me a little baffled in that case.

 We would have to get some thread mutex, make the function call, copy the
 return values into the passed pointer, and release the mutex?  Do we
 test to see if we are in thread mode before doing the locking?  Is that
 test even possible or desirable?

I guess as with the threading stuff in ECPG:

#ifdef SOME_DEF (sorry, have to check the ECPG source on that one)
pthread_mutex_lock(my_mutex)
#endif

/* do stuff */

#ifdef SOME_DEF 
pthread_mutex_unlock(my_mutex)
#endif

 Seems we will need to rename the config variable to be
 NON_REENTRANT_FUNC_NAMES_THREADSAFE, and add configure checks for each
 *_r function, and fall back to the mutex if both settings are false.

Yeah, or you could just always use the wrapper and not try to do all the test 
in configure... doubtless less efficient, but maybe better for the mental 
health...

 This part has me concerned too:
  Copying the struct hostent does not suffice, since it contains
  pointers  -  a  deep copy is required.

 Would someone with thread mutex experience assist me?

Ummm... replace /* do stuff */ above with a deep copy of the hostent struct. 
I'll give that a shot if you like.

Regards, Philip.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Maximum table size

2003-09-09 Thread Gaetano Mendola
Dann Corbit [EMAIL PROTECTED] wrote:
 /*
 ** This will generate a 28 megabyte SQL script.
 ** 1600 table definitions will be created for tables
 ** with from 1 to 1600 columns.
 */

That's the easy part; now you should run real queries, a real VACUUM,
and a real REINDEX on it...


Regards
Gaetano Mendola


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Philip Yarra
On Wed, 10 Sep 2003 12:29 pm, Bruce Momjian wrote:
 --- anyway, it is probably threadsafe, but strerror isn't, so we are
 dead anyway.  :-)

Oh, I see. Yep, good point. Strange that strerror isn't threadsafe when 
everything else is... maybe Strange is OSF's middle name.

  #ifdef SOME_DEF (sorry, have to check the ECPG source on that one)
  pthread_mutex_lock(my_mutex)
  #endif
 
  /* do stuff */
 
  #ifdef SOME_DEF
  pthread_mutex_unlock(my_mutex)
  #endif

 Yep.  Ugly but required.

Could be worse - at least creating a wrapper function keeps the 
aesthetically-offensive code away from most of the code, and everyone else 
could just call pg_gethostbyname() or whatever...

  Yeah, or you could just always use the wrapper and not try to do all the
  test in configure... doubtless less efficient, but maybe better for the
  mental health...

 True.  In fact, on platforms with non-*_r functions that are
 thread-safe, those locks are already done in libc anyway.  The problem
 is that on platforms that don't have non *_r thread-safe, and don't
 have all the *_r functions, we would be adding overhead to libpq that
 isn't already part of libc on that platform, and that seems wrong to me.

 Double-yuck.

No, correct me if I'm wrong, but the #ifdef'd code is removed by the 
pre-processor, so platforms without thread support would gain only the 
overhead of a single function call? That doesn't seem so steep. 

The actual copying of the structs wouldn't be needed in this case, so handle 
that like:

#ifdef SOME_DEF
/* copy structure and set return pointer to this copy */
#else
/* set return pointer to global buffer */
#endif

It's only a penalty for platforms with thread-safe functions called within the 
mutex_locked section... and if we're talking about functions like 
gethostbyname() (which may well be making a network call to a DNS server) I 
doubt the second mutex_lock would be a noticeable penalty.

Making copies of structures is some penalty, that's true... I might try some 
timings to see how much of a penalty. Are these functions likely to see such 
heavy use that the additional times are a problem?

 We might have to produce a libpq_r and ecpg_r (yuck) on those platforms.

I beg you, stay away from this idea! Informix does this, and it isn't pretty. 
I have the core files to prove it.

  Ummm... replace /* do stuff */ above with a deep copy of the hostent
  struct. I'll give that a shot if you like.

 Triple-yuck.  :-)

Hey, are you impugning my coding style? If so, you'll have to join the queue. 
:-)

Do you want me to have a try at the gethostbyname() wrappers, or is it going 
to be a waste of time?

Regards, Philip.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Triple-yuck.  :-)

It doesn't seem to me that we should take on the job of providing
thread-safe implementations of basic libc functions.  If a particular
OS cannot manage to offer that functionality, then we should mark it
not-thread-safe and move on.  Persons unhappy with this labeling must
take it up with their OS developers, not us.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Is it a memory leak in PostgreSQL 7.4beta?

2003-09-09 Thread Bruce Momjian

Have we determined there _isn't_ a memory leak problem in beta2?

---

Tom Lane wrote:
 Hans-Jürgen Schönig [EMAIL PROTECTED] writes:
  The interesting thing was that my postmaster needed around 4mb of RAM 
  when I started running my test script using ...
  After about 2 1/2 hours the backend process already needed 11mb of ram. 
 
 Hmm.  I tried
 
 create table t_data (data int4, ts timestamp default now());
 
 followed by many repetitions of
 
 START TRANSACTION ISOLATION LEVEL READ COMMITTED;
 INSERT INTO t_data (data) VALUES ('2500');
 UPDATE t_data SET data = '2500' WHERE data = '2500';
 DELETE FROM t_data WHERE data = '2500';
 COMMIT;
 
 I am seeing a slow but steady growth of the backend process on a Linux
 box (RHL 8.0) --- top shows it growing a few K every few seconds.
 
 But I see *zero* growth with the same test on HPUX 10.20.
 
 A possible wild card is that the Postgres build I'm using on the Linux
 box is compiled for profiling (-pg, no --enable-debug or --enable-cassert)
 whereas the HPUX build has --enable-debug and --enable-cassert but no
 profiling.  I'm not aware that there's any known memory leakage in
 Linux' profiling support, though.
 
 Can anyone else reproduce this, or confirm they don't see it?  What
 platform, and what configure options?
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Triple-yuck.  :-)
 
 It doesn't seem to me that we should take on the job of providing
 thread-safe implementations of basic libc functions.  If a particular
 OS cannot manage to offer that functionality, then we should mark it
 not-thread-safe and move on.  Persons unhappy with this labeling must
 take it up with their OS developers, not us.

We do actually have a way to report OS thread failure to the user --- if
they ask for --enable-thread-safety, we just throw an error.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Is it a memory leak in PostgreSQL 7.4beta?

2003-09-09 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Have we determined there _isn't_ a memory leak problem in beta2?

I am not sure.  I have a suspicion that there is no real leak, but
rather we are seeing some artifact of the way Linux' top(1) reports
memory usage.  I cannot prove that --- I can only offer the evidence
that the exact same PG sources running the exact same queries on a
different OS (HPUX) show no memory leak.  It would be useful to hear
some more reports of the test case from people with other OSes.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] TCP/IP with 7.4 beta2 broken?

2003-09-09 Thread Bruce Momjian

Are all the IPv6 issues resolved in current CVS?

---

Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  OK, now we are getting somewhere. I see that this would work. It's a bit 
  ugly, though - with this plan the sample file in both CVS and the 
  installation won't necessarily be what actually gets put in place.
 
 Well, like I said, it's not real pretty.  But the same is already true
 of postgresql.conf.sample --- initdb edits that.  I don't see that
 having it edit pg_hba.conf.sample too is so bad.
 
  What if some clever installer/administrator deliberately alters their
  installed sample file?
 
 I don't think it would hurt them.  The editing will consist of a sed
 script to comment or uncomment the line containing ::1, it wouldn't touch
 anything else.
 
  Could we get the configure script to do it instead, since it too should 
  know about ip6 capability? (I guess then we'd have 
  pg_hba.conf.sample.in). That strikes me as being a lot cleaner.
 
 Bruce and I talked about that alternative too, but we felt that it made
 more sense to keep the processing of pg_hba.conf.sample parallel to what
 happens to postgresql.conf.sample.  Further down the road we might need
 initdb-time checks to decide what to do to the sample file, just as we
 already need for postgresql.conf.sample.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Philip Yarra
On Wed, 10 Sep 2003 02:15 pm, Bruce Momjian wrote:
 Tom Lane wrote:
  It doesn't seem to me that we should take on the job of providing
  thread-safe implementations of basic libc functions.  If a particular
  OS cannot manage to offer that functionality, then we should mark it
  not-thread-safe and move on.  

This would be a pretty short list unless I count wrong! This excludes all 
releases of FreeBSD (and I'm willing to bet other BSDs), Solaris (at least 
the old version I have), OSF, Linux, and who knows what else? MacOS X?

  Persons unhappy with this labeling must
  take it up with their OS developers, not us.

Surely the development of PostgreSQL has seen lots of platform shortcomings 
found and worked-around? Why not this as well?

Are these non-threadsafe functions really going to be so heavily-used that we 
can't live with the wrappers? I mean, AFAIK these threading issues are only 
in ECPG and libpq - it's not like re-writing the backend code is required.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Unixware Patch (Was: Re: Beta2 Tag'd and Bundled ...)

2003-09-09 Thread Tom Lane
Philip Yarra [EMAIL PROTECTED] writes:
 On Wed, 10 Sep 2003 02:15 pm, Bruce Momjian wrote:
 Tom Lane wrote:
 It doesn't seem to me that we should take on the job of providing
 thread-safe implementations of basic libc functions.  If a particular
 OS cannot manage to offer that functionality, then we should mark it
 not-thread-safe and move on.  

 This would be a pretty short list unless I count wrong!

If it's a short list, then it's a short list.

 Surely the development of PostgreSQL has seen lots of platform shortcomings 
 found and worked-around? Why not this as well?

Because we are not working in a vacuum.  A thread-safe implementation of
libpq is of zero value to an application unless it also has thread-safe
implementations of the other libraries it depends on.  When the
platform's libc has more thread-safety holes than the average block of
swiss cheese, there is no reason that I can see for us to expend effort
on workarounds that only fix libpq's usage.  Any app that might want to
use libpq is going to hit those same bugs, and so in the long run the
only useful answer is for the platform to fix its libc.

The real bottom line here is: who is going to try to build threaded
apps on platforms with un-thread-safe libc?  And why should we be the
ones to try to save them from suffering the pain they deserve?  We have
enough problems of our own to deal with...

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Stats Collector Error 7.4beta1 and 7.4beta2

2003-09-09 Thread Jan Wieck


Kurt Roeckx wrote:

On Tue, Sep 09, 2003 at 02:10:20AM -0700, Kevin Brown wrote:
 I could go for Jan's idea of putting a random key into the messages,
 if anyone feels that we should not trust to the kernel to enforce the
 packet source address restriction.  But the memcmp() test seems a clear
 loser given today's discussions.
The test in the 7.3.x code looked reasonable to me, especially if it's
possible to make it work with IPV6 (if it doesn't already).  It's doing
basically the right thing, at any rate: directly comparing the actual
fields that are relevant.  Does this test represent a significant
performance hit?
The reason I used a memcmp() instead of dealing with the
structure members themselves is because it was easier.
:-/

Checking that they're the same address family is easy, and if
they're different the kernel is really broken.
Agreed.

For the addresses and port, in case of IPv4, you have to cast it
to sockaddr_in *, and compare the sin_addr and sin_port like
before.
Using a decent C compiler (and who compiles PostgreSQL without one?) this 
should reduce to a single 32-bit and another 16-bit comparison ... no?

For IPv6 you could do it with a memcmp on the sin6_addr part, and
put it inside an #ifdef HAVE_IPV6.
If you want to write code to compare 2 addresses, please make it
a general function and place it in ip.c.
Anyway, I'm happy with the current use of recv().
I disagree here. The clean and secure way is still to do *some* check 
(because we carefully don't assume that all OS's behave like some 
manpages happen to agree to). The performant way is to do it with 
information that is available without any extra system call.

So either we do the random signature thing, which I would favor as a 
one-time be-all, end-all solution --- or we do the actual from-address 
based implementation by restoring the old IPv4 behaviour and adding 
correct IPv6 behaviour.
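For illustration, a rough sketch of such a general function for ip.c (pg_sockaddr_eq is a made-up name; HAVE_IPV6 is the configure symbol guarding the v6 part):

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Compare only the fields that matter instead of memcmp()ing whole
 * sockaddr structs, whose padding bytes the kernel need not zero.
 * Returns 1 if the two addresses denote the same endpoint, else 0.
 */
static int
pg_sockaddr_eq(const struct sockaddr *a, const struct sockaddr *b)
{
	if (a->sa_family != b->sa_family)
		return 0;				/* different families: not the same peer */

	switch (a->sa_family)
	{
		case AF_INET:
			{
				const struct sockaddr_in *a4 = (const struct sockaddr_in *) a;
				const struct sockaddr_in *b4 = (const struct sockaddr_in *) b;

				return a4->sin_addr.s_addr == b4->sin_addr.s_addr &&
					a4->sin_port == b4->sin_port;
			}
#ifdef HAVE_IPV6
		case AF_INET6:
			{
				const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *) a;
				const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *) b;

				return memcmp(&a6->sin6_addr, &b6->sin6_addr,
							  sizeof(a6->sin6_addr)) == 0 &&
					a6->sin6_port == b6->sin6_port;
			}
#endif
		default:
			return 0;			/* unknown family: call it not equal */
	}
}
```

No extra system call is needed: everything compared here is already in hand from recvfrom().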

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [SQL] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Bruce Momjian
Tom Lane wrote:
 Manfred Koizar [EMAIL PROTECTED] writes:
  On Mon, 08 Sep 2003 11:40:32 -0400, Tom Lane [EMAIL PROTECTED]
  wrote:
  4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
  will be accepted in exactly the same cases where they'd be accepted
  in a boolean-requiring SQL construct (such as CASE).  (By default,
  none are, so this isn't really different from #2.  But people could
  create casts to boolean to override this behavior in a controlled
  fashion.)
 
  I vote for 4.
 
 I'm willing to do that.

OK, what release should we do this?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [SQL] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Manfred Koizar
On Mon, 08 Sep 2003 11:40:32 -0400, Tom Lane [EMAIL PROTECTED]
wrote:
4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
   will be accepted in exactly the same cases where they'd be accepted
   in a boolean-requiring SQL construct (such as CASE).  (By default,
   none are, so this isn't really different from #2.  But people could
   create casts to boolean to override this behavior in a controlled
   fashion.)

I vote for 4.  And - being fully aware of similar proposals having
failed miserably - I propose to proceed as follows:

If the current behaviour is considered a bug, let i=4, else let i=5.

In 7.i:  Create a new GUC variable plpgsql_strict_boolean (silly
name, I know) in the VERSION/PLATFORM COMPATIBILITY section of
postgresql.conf.  Make the new behaviour dependent on this variable.
Default plpgsql_strict_boolean to false.  Place a warning into the
release notes and maybe into the plpgsql documentation.

In 7.j, j>i:  Change the default value of plpgsql_strict_boolean to
true.  Issue WARNINGs or NOTICEs as appropriate.  Update
documentation.

In 7.k, k>j:  Remove old behaviour and GUC variable.  Update
documentation.
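For concreteness, the proposed knob (entirely hypothetical at this point, name taken from the sketch above) would sit in postgresql.conf like any other compatibility setting:

```
# VERSION/PLATFORM COMPATIBILITY
#plpgsql_strict_boolean = false   # when true, require the test expression
                                  # of plpgsql IF/WHILE/EXIT to be boolean
```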

Servus
 Manfred

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [SQL] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes:
 On Mon, 08 Sep 2003 11:40:32 -0400, Tom Lane [EMAIL PROTECTED]
 wrote:
 4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
 will be accepted in exactly the same cases where they'd be accepted
 in a boolean-requiring SQL construct (such as CASE).  (By default,
 none are, so this isn't really different from #2.  But people could
 create casts to boolean to override this behavior in a controlled
 fashion.)

 I vote for 4.

I'm willing to do that.

 And - being fully aware of similar proposals having
 failed miserably - I propose to proceed as follows:

 If the current behaviour is considered a bug, let i=4, else let i=5.

 In 7.i:  Create a new GUC variable plpgsql_strict_boolean (silly
 name, I know) in the VERSION/PLATFORM COMPATIBILITY section of
 postgresql.conf.  Make the new behaviour dependent on this variable.
 Default plpgsql_strict_boolean to false.  Place a warning into the
 release notes and maybe into the plpgsql documentation.

 In 7.j, j>i:  Change the default value of plpgsql_strict_boolean to
 true.  Issue WARNINGs or NOTICEs as appropriate.  Update
 documentation.

 In 7.k, k>j:  Remove old behaviour and GUC variable.  Update
 documentation.

I'm not willing to do that much work for what is, in the greater scheme
of things, a tiny change.  If we did that for every user-visible change,
our rate of forward progress would be a mere fraction of what it is.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [SQL] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Richard Hall


Define the language! If it breaks code, so be it.

 2. Throw an error if the expression doesn't return boolean.

Yes, yes, absolutely.  By definition, the test of an IF, WHILE, or EXIT
statement is a boolean expression, SO

 if "some stupid piece of text" THEN

should not compile; there is no boolean expression there.

C's implementation of what is true and false has always, IMHO, been
hideous.  But then again, I am a Pascal kind of thinker.  An integer
with a value of 1 is still only an integer, whereas

 IF I > 0 THEN ...

is clear and unambiguous.


Tom Lane wrote:
Following up this gripe
http://archives.postgresql.org/pgsql-sql/2003-09/msg00044.php
I've realized that plpgsql just assumes that the test expression
of an IF, WHILE, or EXIT statement is a boolean expression.  It
doesn't take any measures to ensure this is the case or convert
the value if it's not the case.  This seems pretty bogus to me.

However ... with the code as it stands, for pass-by-reference datatypes
any nonnull value will appear TRUE, while for pass-by-value datatypes
any nonzero value will appear TRUE.  I fear that people may actually be
depending on these behaviors, particularly the latter one which is
pretty reasonable if you're accustomed to C.  So while I'd like to throw
an error if the argument isn't boolean, I'm afraid of breaking people's
function definitions.

Here are some possible responses, roughly in order of difficulty
to implement:

1. Leave well enough alone (and perhaps document the behavior).

2. Throw an error if the expression doesn't return boolean.

3. Try to convert nonbooleans to boolean using plpgsql's usual method
   for cross-type coercion, ie run the type's output proc to get a
   string and feed it to bool's input proc.  (This seems unlikely to
   avoid throwing an error in very many cases, but it'd be the most
   consistent with other parts of plpgsql.)

4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
   will be accepted in exactly the same cases where they'd be accepted
   in a boolean-requiring SQL construct (such as CASE).  (By default,
   none are, so this isn't really different from #2.  But people could
   create casts to boolean to override this behavior in a controlled
   fashion.)

Any opinions about what to do?

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly



Re: [HACKERS] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Jan Wieck


Tom Lane wrote:

Following up this gripe
http://archives.postgresql.org/pgsql-sql/2003-09/msg00044.php
I've realized that plpgsql just assumes that the test expression
of an IF, WHILE, or EXIT statement is a boolean expression.  It
doesn't take any measures to ensure this is the case or convert
the value if it's not the case.  This seems pretty bogus to me.
However ... with the code as it stands, for pass-by-reference datatypes
any nonnull value will appear TRUE, while for pass-by-value datatypes
any nonzero value will appear TRUE.  I fear that people may actually be
depending on these behaviors, particularly the latter one which is
pretty reasonable if you're accustomed to C.  So while I'd like to throw
an error if the argument isn't boolean, I'm afraid of breaking people's
function definitions.
Here are some possible responses, roughly in order of difficulty
to implement:
1. Leave well enough alone (and perhaps document the behavior).

2. Throw an error if the expression doesn't return boolean.
ERROR is the cleanest way, but I'd vote for conversion to boolean to 
keep the damage within reason.

Jan

3. Try to convert nonbooleans to boolean using plpgsql's usual method
   for cross-type coercion, ie run the type's output proc to get a
   string and feed it to bool's input proc.  (This seems unlikely to
   avoid throwing an error in very many cases, but it'd be the most
   consistent with other parts of plpgsql.)
4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
   will be accepted in exactly the same cases where they'd be accepted
   in a boolean-requiring SQL construct (such as CASE).  (By default,
   none are, so this isn't really different from #2.  But people could
   create casts to boolean to override this behavior in a controlled
   fashion.)
Any opinions about what to do?

			regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 ERROR is the cleanest way, but I'd vote for conversion to boolean to 
 keep the damage within reason.

Which style of conversion did you like?  These were the choices:

 3. Try to convert nonbooleans to boolean using plpgsql's usual method
 for cross-type coercion, ie run the type's output proc to get a
 string and feed it to bool's input proc.  (This seems unlikely to
 avoid throwing an error in very many cases, but it'd be the most
 consistent with other parts of plpgsql.)
 
 4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
 will be accepted in exactly the same cases where they'd be accepted
 in a boolean-requiring SQL construct (such as CASE).  (By default,
 none are, so this isn't really different from #2.  But people could
 create casts to boolean to override this behavior in a controlled
 fashion.)

At this point I'm kinda leaning to #4, because (for example) people
could create a cast from integer to boolean to avoid having to fix their
plpgsql functions right away.  #3 would not offer any configurability of
behavior.
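For example (hypothetical, not anything we'd ship), a site depending on the old C-ish integer behavior could do something like:

```sql
-- Compatibility shim a site could create for itself under option #4:
-- make integer acceptable wherever plpgsql demands a boolean.
CREATE FUNCTION int4_to_bool(integer) RETURNS boolean AS
    'SELECT $1 <> 0' LANGUAGE sql IMMUTABLE;

CREATE CAST (integer AS boolean)
    WITH FUNCTION int4_to_bool(integer) AS IMPLICIT;
```

and then fix their plpgsql functions at leisure.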

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [SQL] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread R. van Twisk
I would suggest throwing an error, or at least a warning.

This will FORCE people to program in the correct way.

I also thought that 'IF $1 THEN ...' should work OK, but on second
thought it's indeed stupid to write it that way (I'm from the C world...)

Ries

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Monday, September 8, 2003 17:41
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: [SQL] plpgsql doesn't coerce boolean expressions to boolean


Following up this gripe
http://archives.postgresql.org/pgsql-sql/2003-09/msg00044.php
I've realized that plpgsql just assumes that the test expression
of an IF, WHILE, or EXIT statement is a boolean expression.  It
doesn't take any measures to ensure this is the case or convert
the value if it's not the case.  This seems pretty bogus to me.

However ... with the code as it stands, for pass-by-reference datatypes
any nonnull value will appear TRUE, while for pass-by-value datatypes
any nonzero value will appear TRUE.  I fear that people may actually be
depending on these behaviors, particularly the latter one which is
pretty reasonable if you're accustomed to C.  So while I'd like to throw
an error if the argument isn't boolean, I'm afraid of breaking people's
function definitions.

Here are some possible responses, roughly in order of difficulty
to implement:

1. Leave well enough alone (and perhaps document the behavior).

2. Throw an error if the expression doesn't return boolean.

3. Try to convert nonbooleans to boolean using plpgsql's usual method
   for cross-type coercion, ie run the type's output proc to get a
   string and feed it to bool's input proc.  (This seems unlikely to
   avoid throwing an error in very many cases, but it'd be the most
   consistent with other parts of plpgsql.)

4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
   will be accepted in exactly the same cases where they'd be accepted
   in a boolean-requiring SQL construct (such as CASE).  (By default,
   none are, so this isn't really different from #2.  But people could
   create casts to boolean to override this behavior in a controlled
   fashion.)

Any opinions about what to do?

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] plpgsql doesn't coerce boolean expressions to boolean

2003-09-09 Thread Jan Wieck


Tom Lane wrote:
Jan Wieck [EMAIL PROTECTED] writes:
ERROR is the cleanest way, but I'd vote for conversion to boolean to 
keep the damage within reason.
Which style of conversion did you like?  These were the choices:

3. Try to convert nonbooleans to boolean using plpgsql's usual method
for cross-type coercion, ie run the type's output proc to get a
string and feed it to bool's input proc.  (This seems unlikely to
avoid throwing an error in very many cases, but it'd be the most
consistent with other parts of plpgsql.)
4. Use the parser's coerce_to_boolean procedure, so that nonbooleans
will be accepted in exactly the same cases where they'd be accepted
in a boolean-requiring SQL construct (such as CASE).  (By default,
none are, so this isn't really different from #2.  But people could
create casts to boolean to override this behavior in a controlled
fashion.)
At this point I'm kinda leaning to #4, because (for example) people
could create a cast from integer to boolean to avoid having to fix their
plpgsql functions right away.  #3 would not offer any configurability of
behavior.
Agreed - #4.

Thinking of the problem of feature deprecation and transition 
time, it would be nice for this kind of compatibility-breaking change 
to have a _per database_ config option that controls old vs. new 
behaviour, wouldn't it? I don't know exactly what form you'd like that 
to take. Maybe a pg_config catalog that inherits default settings from 
template1 but can then be changed in every database. This would even 
include the possibility to *switch* one single prod database back to the 
old behaviour in case the supposedly cleaned-up application isn't as 
clean as supposed.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html