Re: [HACKERS] Should next release be 8.0 (Was: Re: [GENERAL] I am being interviewed by O'Reilly)

2002-07-05 Thread Marc G. Fournier

On Fri, 5 Jul 2002, Curt Sampson wrote:


 While there are big changes between 7.2 and the next release, they
 aren't really any bigger than others during the 7.x series. I don't
 really feel that the next release is worth an 8.0 rather than a 7.3. But
 this is just an opinion; it's not something I'm prepared to argue about.

Actually, the big change is such that it will, at least as far as I'm
understanding it, break pretty much every front-end application ... which,
I'm guessing, is pretty major, no? :)










Re: [HACKERS] Should next release be 8.0 (Was: Re: [GENERAL] I am being interviewed by O'Reilly)

2002-07-05 Thread Marc G. Fournier

On Fri, 5 Jul 2002, Tom Lane wrote:

 Marc G. Fournier [EMAIL PROTECTED] writes:
  Actually, the big change is such that it will, at least as far as I'm
  understanding it, break pretty much every front-end application ...

 Only those that inspect system catalogs --- I'm not sure what percentage
 that is, but surely it's not "pretty much every one".  psql for example
 is only affected because of its \d commands.

Okay, anyone have any ideas of other packages that would inspect the
system catalog?  The only ones I could think of, off the top of my head,
would be pgAccess, pgAdmin and phpPgAdmin ... but I would guess that any
'administratively oriented' interface would face similar problems, no?











Re: [HACKERS] Should next release be 8.0 (Was: Re: [GENERAL] I am being interviewed by O'Reilly)

2002-07-05 Thread Tom Lane

Marc G. Fournier [EMAIL PROTECTED] writes:
 Actually, the big change is such that it will, at least as far as I'm
 understanding it, break pretty much every front-end application ...

Only those that inspect system catalogs --- I'm not sure what percentage
that is, but surely it's not "pretty much every one".  psql for example
is only affected because of its \d commands.

regards, tom lane








Re: [HACKERS] Proposal: CREATE CONVERSION

2002-07-05 Thread Tom Lane

Tatsuo Ishii [EMAIL PROTECTED] writes:
 Syntax proposal:
 CREATE CONVERSION <conversion name>
    SOURCE <source encoding name>
    DESTINATION <destination encoding name>
    FROM <conversion function name>

Doesn't a conversion currently require several support functions?
How much overhead will you be adding to funnel them all through
one function?

Basically I'd like to see a spec for the API of the conversion
function...

Also, is there anything in SQL99 that we ought to try to be
compatible with?

regards, tom lane








Re: [HACKERS] Proposal: CREATE CONVERSION

2002-07-05 Thread Tom Lane

Tatsuo Ishii [EMAIL PROTECTED] writes:
 Doesn't a conversion currently require several support functions?
 How much overhead will you be adding to funnel them all through
 one function?

 No, only one function is sufficient. What else are you thinking of?

I see two different functions linked to from each pg_wchar_table
entry... although perhaps those are associated with encodings
not with conversions.

 Basically I'd like to see a spec for the API of the conversion
 function...

 That would be very simple (the previous example I gave was unnecessarily
 complex). The function signature would look like:
 conversion_function(TEXT) RETURNS TEXT
 It receives the source text, converts it, and returns the result. That's all.

IIRC the existing conversion functions deal in C string pointers and
lengths.  I'm a little worried about the extra overhead implicit
in converting to a TEXT object and back again; that probably means at
least two more palloc and memcpy operations.  I think you'd be better
off sticking to a C-level API, because I really don't believe that
anyone is going to code conversion functions in (say) plpgsql.
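
For illustration, a minimal sketch of the kind of C-level API described
above; the typedef and the example conversion are assumptions for this
note, not an actual backend interface:

/*
 * Hypothetical C-level conversion hook: raw byte pointers plus a length,
 * so no text object has to be built and copied.
 */
typedef void (*conversion_proc) (const unsigned char *src,
                                 unsigned char *dest, int len);

/*
 * Example body: Latin-1 to UTF-8, which needs at most two output bytes
 * per input byte; the caller is assumed to size dest accordingly.
 */
static void
latin1_to_utf8(const unsigned char *src, unsigned char *dest, int len)
{
    while (len-- > 0)
    {
        unsigned char c = *src++;

        if (c < 0x80)
            *dest++ = c;                    /* ASCII passes through */
        else
        {
            *dest++ = 0xc0 | (c >> 6);      /* two-byte UTF-8 sequence */
            *dest++ = 0x80 | (c & 0x3f);
        }
    }
    *dest = '\0';
}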

regards, tom lane








Re: [HACKERS] Proposal: CREATE CONVERSION

2002-07-05 Thread Tatsuo Ishii

  CREATE CONVERSION <conversion name>
     SOURCE <source encoding name>
     DESTINATION <destination encoding name>
     FROM <conversion function name>
 
 Doesn't a conversion currently require several support functions?
 How much overhead will you be adding to funnel them all through
 one function?

No, only one function is sufficient. What else are you thinking of?

 Basically I'd like to see a spec for the API of the conversion
 function...

That would be very simple (the previous example I gave was unnecessarily
complex). The function signature would look like:

conversion_function(TEXT) RETURNS TEXT

It receives the source text, converts it, and returns the result. That's all.
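
As a concrete sketch, a function with that signature would look roughly
like this under the V1 calling convention; the identity copy stands in
for a real conversion and also shows the extra palloc/memcpy pair
discussed elsewhere in this thread:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(noop_conversion);

/* Sketch only: copies its input where real code would transcode it. */
Datum
noop_conversion(PG_FUNCTION_ARGS)
{
    text   *src = PG_GETARG_TEXT_P(0);
    int32   len = VARSIZE(src) - VARHDRSZ;
    text   *dst = (text *) palloc(VARHDRSZ + len);

    VARATT_SIZEP(dst) = VARHDRSZ + len;
    memcpy(VARDATA(dst), VARDATA(src), len);
    PG_RETURN_TEXT_P(dst);
}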

 Also, is there anything in SQL99 that we ought to try to be
 compatible with?

As far as I know there's no such equivalent in SQL99.
--
Tatsuo Ishii








[HACKERS] Typo in htup.h comment

2002-07-05 Thread Manfred Koizar

Fix typo in xl_heaptid comment

Servus
 Manfred

--- ../base/src/include/access/htup.h   2002-07-04 18:05:04.0 +0200
+++ src/include/access/htup.h   2002-07-05 16:52:44.0 +0200
@@ -268,15 +268,15 @@
 /*
  * When we insert 1st item on new page in INSERT/UPDATE
  * we can (and we do) restore entire page in redo
  */
 #define XLOG_HEAP_INIT_PAGE 0x80
 
 /*
- * All what we need to find changed tuple (18 bytes)
+ * All what we need to find changed tuple (14 bytes)
  *
  * NB: on most machines, sizeof(xl_heaptid) will include some trailing pad
  * bytes for alignment.  We don't want to store the pad space in the XLOG,
  * so use SizeOfHeapTid for space calculations.  Similar comments apply for
  * the other xl_FOO structs.
  */
 typedef struct xl_heaptid
 {
RelFileNode node;
ItemPointerData tid;/* changed tuple id */
 } xl_heaptid;

Servus
 Manfred








Re: [HACKERS] (A) native Windows port

2002-07-05 Thread Andrew Sullivan

On Fri, Jul 05, 2002 at 12:39:13PM -0400, Lamar Owen wrote:

 One other usability note: why can't postmaster perform the steps of
 an initdb if -D points to an empty directory?  It's not that much
 code, is it?  (I know that one extra step isn't backbreaking, but
 I'm looking at this from a rank newbie's point of view -- or at
 least I'm trying to look at it in that way, as it's been a while
 since I was a rank newbie at PostgreSQL) Oh well, just a random
 thought.

Rank newbies shouldn't be protected in this way, partly because if
something goes wrong, _they won't know what to do_.  Please, please,
don't be putting automagic, database destroying functions like that
into the postmaster.  It's a sure way to cause a disaster at some
point.

A

-- 

Andrew Sullivan   87 Mowat Avenue 
Liberty RMS   Toronto, Ontario Canada
[EMAIL PROTECTED]  M6K 3E3
 +1 416 646 3304 x110









[HACKERS] Page type and version

2002-07-05 Thread Manfred Koizar

As the upcoming release is breaking compatibility anyway:  what do you
think about placing a magic number and some format version info into
the page header?

One 32-bit-number per page should be enough to encode page type and
version.  We just have to decide how we want it:

a) combine page type and version into a single 32-bit magic number

HEAPPAGE73 = 0x63c490c9
HEAPPAGE74 = 0x540beeb3
...
BTREE73= 0x8cdc8edb
BTREE74= 0xbb13f0a1

b) use n bits for the page type and the rest for a version number

HEAPPAGE73 = 0x63c40703
HEAPPAGE74 = 0x63c40704
...
BTREE73= 0x8cdc0703
BTREE74= 0x8cdc0704

The latter has the advantage that the software could easily check for
a version range (e.g. if (PageGetVersion(page) >= 0x0703) ...).
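
For illustration, option b) could look like the following, assuming a
32-bit pd_magic field were added to the page header (the field and the
macro names are made up for this example):

/* Hypothetical layout: high 16 bits = page type, low 16 bits = version. */
#define PAGE_TYPE_MASK      0xffff0000
#define PAGE_VERSION_MASK   0x0000ffff

#define PageGetType(page) \
    (((PageHeader) (page))->pd_magic & PAGE_TYPE_MASK)
#define PageGetVersion(page) \
    (((PageHeader) (page))->pd_magic & PAGE_VERSION_MASK)

/* A version-range check is then a plain comparison: */
if (PageGetVersion(page) >= 0x0703)
    ;                           /* page is in 7.3-or-later format */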

One might argue that one magic number *per file* should be
sufficient.  That would mean that the first page of a file had to
have a different format.  Btree has such a meta page;  I don't know
about the other access methods.

With a magic number in every single page it could even be possible to
do a smooth upgrade:  Just install Postgres 8.0 and continue to use
your PostgreSQL 7.4 databases :-).  Whenever the backend reads an old
format page it uses alternative accessor routines.  New pages are
written in the new format.  Or the database can be run in
compatibility mode ...  I'm dreaming ...

Thoughts?

Servus
 Manfred








Re: [HACKERS] Proposal: CREATE CONVERSION

2002-07-05 Thread Tatsuo Ishii

 I see two different functions linked to from each pg_wchar_table
 entry... although perhaps those are associated with encodings
 not with conversions.

Yes, those are not directly associated with conversions.

 IIRC the existing conversion functions deal in C string pointers and
 lengths.  I'm a little worried about the extra overhead implicit
 in converting to a TEXT object and back again; that probably means at
 least two more palloc and memcpy operations.  I think you'd be better
 off sticking to a C-level API, because I really don't believe that
 anyone is going to code conversion functions in (say) plpgsql.

I am worried about that too. But if we stick to a C-level API, how can we
define an argument data type suitable for C strings? I don't see such
data types. Maybe you are suggesting that we should not use CREATE
FUNCTION?
--
Tatsuo Ishii








Re: [HACKERS] Issues Outstanding for Point In Time Recovery (PITR)

2002-07-05 Thread Bruce Momjian

Bruce Momjian wrote:
 You are saying, "How do we know what WAL records go with that backup
 snapshot of the file?"  OK, let's assume we are shut down.  You can grab
 the WAL log info from pg_control using contrib/pg_controldata and that
 tells you what WAL logs to roll forward when you need to PIT recover
 that backup later.  If you store that info in the first file you back up,
 you can have that WAL pointer available for later recovery in case you
 are restoring from that backup.  Is that the issue?
 
 What seems more complicated is doing the backup while the database is
 active, and this may be a requirement for a final PITR solution.  Some
 think we can grab the WAL pointer at 'tar' start and replay that on the
 backup even if the file changes during backup.

OK, I think I understand live backups now using tar and PITR.  Someone
explained this to me months ago but now I understand it.

First, a key issue is that PostgreSQL doesn't fiddle with individual
items on disk.  It reads an 8k block, modifies it, (writes it to WAL if
it hasn't been written to that WAL segment before), and writes it to
disk.  That is key.  (Are there cases where we don't do this, like
pg_controldata?)

OK, so you do a tar backup of a file.  While you are doing the tar,
certain 8k blocks are being modified in the file.  There is no way to
know what blocks are modified as you are doing the tar, and in fact you
could read partial page writes during the tar.

One solution would be to read the file using the PostgreSQL page buffer,
but even then, getting a stable snapshot of the file would be difficult.
Now, we could lock the table and prevent writes while it is being backed
up, but there is a better way.

We already have pre-change page images in WAL.  When we do the backup,
any page that was modified while we were backing up is in the WAL.  On
restore, we can recover whatever tar saw of the file, knowing that the
WAL page images will recover any page changes made during the tar.

Now, you mentioned we may not want pre-change page images in WAL
because, with PITR, we can more easily recover from the WAL rather than
having this performance hit for many page writes.

What I suggest is a way for the backup tar to turn on pre-change page
images while the tar is happening, and turn it off after the tar is
done.

We already have this TODO item:

* Turn off after-change writes if fsync is disabled (?)

No sense in doing after-change WAL writes without fsync.  We could
extend this so those after-change writes could be turned on and off,
allowing full tar backups and PITR recovery.  In fact, for people with
reliable hardware, we should already be giving them the option of
turning off pre-change writes.  We don't have a way of detecting partial
page writes, but then again, we can't detect failures with fsync off
anyway, so it seems to be the same vulnerability.  I guess that's why we
were going to wrap the effect into the same variable, but for PITR, I can
see wanting fsync always on and the ability to turn pre-change writes on
and off.
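
A sketch of the switch proposed here, with made-up names (none of these
are existing backend symbols):

/* Assumed GUC-style flags; the backup tool would turn the second one
 * on for the duration of the tar and off again afterwards. */
bool    log_page_images    = true;   /* pre-change page images in WAL */
bool    backup_in_progress = false;  /* set while a tar backup runs */

static bool
need_full_page_image(void)
{
    /* Write a pre-change image if the feature is on, or if a live
     * backup needs it to repair torn reads made during the tar. */
    return log_page_images || backup_in_progress;
}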

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] BETWEEN Node & DROP COLUMN

2002-07-05 Thread Tom Lane

Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 What happens if I drop an inherited column in a child table?  Maybe it
 works, but what happens when I SELECT the column in the parent table?

 Well, what happens if you rename a column in a child table?  Same problem?

Ideally we should disallow both of those, as well as cases like
changing the column type.

It might be that we can use the pg_depend stuff to enforce this (by
setting up dependency links from child to parent).  However that would
introduce a ton of overhead in a regular DROP TABLE, and you'd still
need specialized code to prevent the RENAME case (pg_depend wouldn't
care about that).  I'm thinking that it's worth adding an attisinherited
column to pg_attribute to make these rules easy to enforce.
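
With such a column, the rename path could reject inherited columns with
a check along these lines (attisinherited is only proposed here, so the
code is hypothetical):

/* In renameatt(), after fetching the column's pg_attribute tuple: */
if (attform->attisinherited)
    elog(ERROR, "renameatt: inherited column \"%s\" may not be renamed",
         oldattname);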

regards, tom lane








Re: [HACKERS] BETWEEN Node & DROP COLUMN

2002-07-05 Thread Tom Lane

Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 1. I'm going to prevent people from dropping the last column in their table.
 I think this is the safest option.  How do I check if there are any other
 non-dropped columns in a table?  Reference code anywhere?

You look through the Relation's tupledesc and make sure there's at least
one other non-dropped column.
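
For example, a helper along these lines, assuming the attisdropped flag
this thread is adding (the function itself is illustrative):

/* Returns true if rel has at least one live column other than attnum. */
static bool
has_other_live_column(Relation rel, AttrNumber attnum)
{
    TupleDesc   tupdesc = RelationGetDescr(rel);
    int         i;

    for (i = 0; i < tupdesc->natts; i++)
    {
        Form_pg_attribute att = tupdesc->attrs[i];

        if (att->attnum != attnum && !att->attisdropped)
            return true;
    }
    return false;
}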

 2. What should I do about inheritance?  I'm going to implement it, but are
 there issues?  It will basically drop the column with the same name in all
 child tables.  Is that correct behaviour?

Yes, if the 'inh' flag is set.

If 'inh' is not set, then the right thing would be to drop the parent's
column and mark all the *first level* children's columns as
not-inherited.  How painful that would be depends on what representation
we choose for marking inherited columns, if any.

 3. I am going to initially implement the patch to ignore the behaviour and
 do no dependency checking.  I will assume that Rod's patch will handle that
 without much trouble.

Yeah, Rod was looking ahead to DROP COLUMN.  I'm still working on his
patch (mostly the pg_constraint side) but should have it soon.

regards, tom lane








Re: [HACKERS] DROP COLUMN Progress

2002-07-05 Thread Tom Lane

Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 So, say the 'b' column is renamed to 'dropped_2', then you can do this:

 select dropped_2 from tab;
 select tab.dropped_2 from tab;
 update tab set dropped_2 = 3;
 select * from tab where dropped_2 = 3;

 Where have I missed the COLUMN_IS_DROPPED checks???

Sounds like you aren't checking in the part of the parser that resolves
simple variable references.
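
That is, the column-lookup path needs a test along these lines (the
placement and names are illustrative):

/* Where a column reference is resolved to a pg_attribute entry: */
if (att->attisdropped)
    elog(ERROR, "Attribute '%s' not found", colname);
/* i.e. report it exactly as if the column never existed */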

 Another thing:  I don't want to name dropped columns 'dropped_...' as I
 think that's unfair on our non-English speaking users.  Should I just use
 '' or something?

Don't be silly --- the system catalogs are completely English-centric
already.  Do you want to change all the catalog and column names to
meaningless strings?  Since the dropped columns should be invisible to
anyone who's not poking at the catalogs, I don't see that we are adding
any cognitive load ...

regards, tom lane








Re: [HACKERS] Issues Outstanding for Point In Time Recovery (PITR)

2002-07-05 Thread Zeugswetter Andreas SB SD


 Let me re-write it, and I'll post it in the next version. The section
 dealt with what to do when you have a valid restored controlfile from a
 backup system, which is in the DB_SHUTDOWNED state, and that points to a
 valid shutdown/checkpoint record in the log; only the checkpoint record
 happens not to be the last one in the log. This is a situation that
 could never happen now, but would in PITR.

But it would need to be restore's responsibility to set the flag to 
DB_IN_PRODUCTION, no?

 Even if we shut down before we copy the file, we don't want a file that
 hasn't been written to in 5 weeks before it was backed up to require
 five weeks of old log files to recover. So we need to track that
 information somehow, because right now if we scanned the blocks in the
 file looking at the page LSNs, the greatest LSN we would see might
 be much older than where it would be safe to recover from. That is the
 biggest problem, I think.

Well, if you skip a validity test it could be restore's responsibility 
to know which checkpoint was last before the file backup was taken. 
(When doing a backup you would need to include the last checkpoint info
== pg_control at start of backup)

Andreas








[HACKERS] pg_controldata

2002-07-05 Thread Thomas Lockhart

I modified pg_controldata to display a few new fields. Example output
appears at the end of this message, and the cvs log is:

Add a few new lines to display recently added fields in the ControlFile 
 structure.
Now includes the following new fields:
 integer/float date/time storage
 maximum length of names (+1; they must also include a null termination)
 maximum number of function arguments
 maximum length of locale name

   - Thomas


myst$ ./pg_controldata
pg_control version number:72
Catalog version number:   200207021
Database state:   IN_PRODUCTION
pg_control last modified: Fri Jul  5 08:33:18 2002
Current log file id:  0
Next log file segment:2
Latest checkpoint location:   0/1663838
Prior checkpoint location:0/16637F8
Latest checkpoint's REDO location:0/1663838
Latest checkpoint's UNDO location:0/0
Latest checkpoint's StartUpID:12
Latest checkpoint's NextXID:  4925
Latest checkpoint's NextOID:  139958
Time of latest checkpoint:Fri Jul  5 08:33:01 2002
Database block size:  8192
Blocks per segment of large relation: 131072
Maximum length of names:  32
Maximum number of function arguments: 16
Date/time type storage:   64-bit integers
Maximum length of locale name:128
LC_COLLATE:   C
LC_CTYPE: C








Re: [HACKERS] (A) native Windows port

2002-07-05 Thread Lamar Owen

On Wednesday 03 July 2002 12:09 pm, Bruce Momjian wrote:
 Hannu Krosing wrote:
  AFAIK I can run as many backends as I like (up to some practical limit)
  on the same comuter at the same time, as long as they use different
  ports and different data directories.

 We don't have an automated system for doing this.  Certainly it is done
 all the time.

Good.  Dialog.  This is better than what I am used to when I bring up 
upgrading. :-)

Bruce, pg_upgrade isn't as kludgey as what I have been doing with the RPMset 
for these nearly three years.

No, what I envisioned was a standalone dumper that can produce dump output 
without having a backend at all.  If this dumper knows about the various 
binary formats, and knows how to get my data into a form I can then restore 
reliably, I will be satisfied.  If it can be easily automated so much the 
better.  Doing it table by table would be ok as well.

I'm looking for a sequence such as:


PGDATA=location/of/data/base
TEMPDATA=location/of/temp/space/on/same/file/system

mv $PGDATA/* $TEMPDATA
initdb -D $PGDATA
pg_dbdump $TEMPDATA |pg_restore  {with its associated options, etc}

With an rm -rf of $TEMPDATA much further down the pike.

Keys to this working:
1.) Must not require the old version executable backend.  There are a number 
of reasons why this might be, but the biggest is due to the way much 
upgrading works in practice -- the old executables are typically gone by the 
time the new package is installed.

2.) Uses pg_dbdump of the new version.  This dumper can be tailored to provide 
the input pg_restore wants to see.  The dump-restore sequence has always had 
dumped-data version mismatch as its biggest problem -- there have been issues 
before where you would have to install the new version of pg_dump to run 
against the old backend.  This is unacceptable in the real world of binary 
packages.

One other usability note: why can't postmaster perform the steps of an initdb 
if -D points to an empty directory?  It's not that much code, is it?  (I know 
that one extra step isn't backbreaking, but I'm looking at this from a rank 
newbie's point of view -- or at least I'm trying to look at it in that way, 
as it's been a while since I was a rank newbie at PostgreSQL)  Oh well, just 
a random thought.

But I believe a backend-independent data dumper would be very useful in many 
contexts, particularly those where a backend cannot be run for whatever 
reason, but you need your data (corrupted system catalogs, high system load, 
whatever).  Upgrading is just one of those contexts.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11










Re: [HACKERS] Should next release be 8.0 (Was: Re: [GENERAL] I am being interviewed by O'Reilly)

2002-07-05 Thread Thomas Lockhart

 Actually, the big change is such that it will, at least as far as I'm
 understanding it, break pretty much every front-end application ... which,
 I'm guessing, is pretty major, no? :)

I've always thought of our release numbering as having themes. The 6.x
series took Postgres from interesting but buggy to a solid system, with
a clear path to additional capabilities. The 7.x series fleshes out SQL
standards compliance and rationalizes the O-R features, as well as adds
to robustness and speed with WAL etc. And the 8.x series would enable
Postgres to extend to distributed systems etc., quite likely having some
fundamental restructuring of the way we handle sources of data (remember
our discussions a couple years ago regarding tuple sources?).

So I feel that bumping to 8.x just for schemas is not necessary. I
*like* the idea of having more than one or two releases in a series, and
would be very happy to see a 7.3 released.

   - Thomas








Re: [HACKERS] Should next release be 8.0 (Was: Re: [GENERAL] I am being interviewed by O'Reilly)

2002-07-05 Thread Marc G. Fournier

On Fri, 5 Jul 2002, Thomas Lockhart wrote:

  Actually, the big change is such that it will, at least as far as I'm
  understanding it, break pretty much every front-end application ... which,
  I'm guessing, is pretty major, no? :)

 I've always thought of our release numbering as having themes. The 6.x
 series took Postgres from interesting but buggy to a solid system, with
 a clear path to additional capabilities. The 7.x series fleshes out SQL
 standards compliance and rationalizes the O-R features, as well as adds
 to robustness and speed with WAL etc. And the 8.x series would enable
 Postgres to extend to distributed systems etc., quite likely having some
 fundamental restructuring of the way we handle sources of data (remember
 our discussions a couple years ago regarding tuple sources?).

 So I feel that bumping to 8.x just for schemas is not necessary. I
 *like* the idea of having more than one or two releases in a series, and
 would be very happy to see a 7.3 released.

Seems I'm the only one for 8.x, so 7.3 it is :)










Re: [HACKERS] DROP COLUMN Progress

2002-07-05 Thread Bruce Momjian

Christopher Kings-Lynne wrote:
 OK,
 
 This is the problem I'm having with the DROP COLUMN implementation.  Since
 I've already incorporated all of Hiroshi's changes, I think this may have
 been an issue in his trial implementation as well.
 
 I have attached my current patch, which works fine and compiles properly.
 
 Ok, if you drop a column 'b', then all these work properly:
 
 select * from tab;
 select tab.* from tab;
 select b from tab;
 update tab set b = 3;
 select * from tab where b = 3;
 insert into tab (b) values (3);
 
 That's all good.  However, the issue is that one of the things that happens
 when you drop a column is that the column is renamed to 'dropped_%attnum%'.
 So, say the 'b' column is renamed to 'dropped_2', then you can do this:
 
 select dropped_2 from tab;
 select tab.dropped_2 from tab;
 update tab set dropped_2 = 3;
 select * from tab where dropped_2 = 3;
 
 Where have I missed the COLUMN_IS_DROPPED checks???

OK, my guess is that it is checks in parser/.  I would issue each of
these queries with a non-existent column name, find the error message in
the code, and add an isdropped check in those places.

 Another thing:  I don't want to name dropped columns 'dropped_...' as I
 think that's unfair on our non-English speaking users.  Should I just use
 '' or something?

I think "dropped" is OK.  The SQL is still English, e.g. DROP.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] Should next release be 8.0 (Was: Re: [GENERAL] I am being interviewed by O'Reilly)

2002-07-05 Thread Bruce Momjian

Marc G. Fournier wrote:
 On Fri, 5 Jul 2002, Tom Lane wrote:
 
  Marc G. Fournier [EMAIL PROTECTED] writes:
   Actually, the big change is such that it will, at least as far as I'm
   understanding it, break pretty much every front-end application ...
 
  Only those that inspect system catalogs --- I'm not sure what percentage
  that is, but surely it's not "pretty much every one".  psql for example
  is only affected because of its \d commands.
 
 Okay, anyone have any ideas of other packages that would inspect the
 system catalog?  The only ones I could think of, off the top of my head,
 would be pgAccess, pgAdmin and phpPgAdmin ... but I would guess that any
 'administratively oriented' interface would face similar problems, no?

That's a good point.  Only the admin stuff is affected, not all
applications.  All applications _can_ now use schemas, but in most
cases applications keep working unchanged.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] Proposal: CREATE CONVERSION

2002-07-05 Thread Bruce Momjian

Tatsuo Ishii wrote:
 Here is my proposal for new CREATE CONVERSION which makes it possible
 to define new encoding conversion mapping between two encodings on the
 fly.
 
 The background:
 
 We are getting more and more encoding conversion tables. Up to
 now, they reach 385352 source lines and over 3MB in compiled forms
 in total. They are statically linked to the backend. I know this
 itself is not a problem since modern OSs have smart memory management
 capabilities to fetch only necessary pages from a disk. However, I'm
 worried about the infinite growing of these static tables.  I think
 users won't love 50MB PostgreSQL backend load module.

Yes, those conversion tables are getting huge in the tarball too:

$ pwd
/pg/backend/utils/mb
$ du
4   ./CVS
7   ./Unicode/CVS
9541./Unicode
15805   .

Look at these two files alone:

-rw-r--r--  1 postgres  wheel  1427492 Jun 13 04:28 gb18030_to_utf8.map
-rw-r--r--  1 postgres  wheel  1427492 Jun 13 04:28 utf8_to_gb18030.map

If we can make these loadable, that would be good.  What would be really
interesting is if we could split these out into a separate
directory/project so development on those could take place in an
independent way.  This would probably stimulate even more encoding
options for users.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] (A) native Windows port

2002-07-05 Thread Bruce Momjian

Lamar Owen wrote:
 On Wednesday 03 July 2002 12:09 pm, Bruce Momjian wrote:
  Hannu Krosing wrote:
   AFAIK I can run as many backends as I like (up to some practical limit)
   on the same computer at the same time, as long as they use different
   ports and different data directories.
 
  We don't have an automated system for doing this.  Certainly it is done
  all the time.
 
 Good.  Dialog.  This is better than what I am used to when I bring up 
 upgrading. :-)
 
 Bruce, pg_upgrade isn't as kludgey as what I have been doing with the RPMset 
 for these nearly three years.
 
 No, what I envisioned was a standalone dumper that can produce dump output 
 without having a backend at all.  If this dumper knows about the various 
 binary formats, and knows how to get my data into a form I can then restore 
 reliably, I will be satisfied.  If it can be easily automated so much the 
 better.  Doing it table by table would be ok as well.

The problem with a standalone dumper is that you would have to recode
this for every release, with little testing possible.  Having the old
backend active saves us that step.  If we get it working, we can use it
over and over again for each release with little work on our part.

 Keys to this working:
 1.)   Must not require the old version executable backend.  There are a number 
 of reasons why this might be, but the biggest is due to the way much 
 upgrading works in practice -- the old executables are typically gone by the 
 time the new package is installed.

Oh, that is a problem.  We would have to require the old executables.

 2.)   Uses pg_dbdump of the new version.  This dumper can be tailored to provide 
 the input pg_restore wants to see.  The dump-restore sequence has always had 
 dumped-data version mismatch as its biggest problem -- there have been issues 
 before where you would have to install the new version of pg_dump to run 
 against the old backend.  This is unacceptable in the real world of binary 
 packages.
 
 One other usability note: why can't postmaster perform the steps of an initdb 
 if -D points to an empty directory?  It's not that much code, is it?  (I know 
 that one extra step isn't backbreaking, but I'm looking at this from a rank 
 newbie's point of view -- or at least I'm trying to look at it in that way, 
 as it's been a while since I was a rank newbie at PostgreSQL)  Oh well, just 
 a random thought.

The issue is that if you have PGDATA pointed to the wrong place, it
creates a new instance automatically.  Could be strange for people, but
we could prompt them to run initdb I guess.

 But I believe a backend-independent data dumper would be very useful in many 
 contexts, particularly those where a backend cannot be run for whatever 
 reason, but you need your data (corrupted system catalogs, high system load, 
 whatever).  Upgrading is just one of those contexts.

Yes, but who wants to write one of those for every release?  That is
where we get stuck, and with our limited resources, is it desirable to
encourage people to work on it?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] Issues Outstanding for Point In Time Recovery (PITR)

2002-07-05 Thread Bruce Momjian

J. R. Nield wrote:
 On Fri, 2002-07-05 at 01:42, Bruce Momjian wrote:
  
  We have needed
  point-in-time recovery for a long time, 
 
 Most thanks should go to vadim (and whoever else worked on this), since
 his WAL code already does most of the work. The key thing is auditing
 the backend to look for every case where we assume some action is not
 visible until after commit, and therefore don't log its effects. Those
 are the main cases that must be changed.

Yep.  Glad you can focus on that.

  ---
  
  J. R. Nield wrote:
   Hello:
   
   I've got the logging system to the point where I can take a shutdown
   consistent copy of a system, and play forward through multiple
   checkpoints. It seems to handle CREATE TABLE/DROP TABLE/TRUNCATE
 
 But notably not for the btree indexes! It looked like they were working,
 because the files were there, and all indexes created before the backup
 would work under insert/delete (including sys catalog indexes). This is
 because btree insert/delete is logged, just not during build. So I
 missed that one case.
 
 You will end up with up-to-date table data though, so it is something.
 
 Adding logging support to btbuild is the next step, and I don't think it
 should be too hard. I am working on this now.

Great.

 It is also a major advantage that most everything in the system gets
 stored in the catalog tables, and so is logged already.
 
 
  Uh, we do log pre-page writes to WAL to recover from partial page
  writes to disk.  Is there something more we need here?
  
  As for bad block detection, we have thought about adding a CRC to each
  page header, or at least making it optional.  WAL already has a CRC.
 
 
 Yes this should be last to do, because it is not necessary for PITR,
 only for performance (the option not to write pre-images without fear of
 data loss). 

Yep.

  Yes, there are a few places where we actually create a file, and if the
  server crashes, the file remains out there forever.  We need to track that
  better.  
 
 OK, there is a bigger problem than just tracking the file though. We
 sometimes do stuff to that file that we don't log. We assume that if we
 commit, the file must be OK and will not need replay because the
 transaction would not have committed if the file was not in a commitable
 state. If we abort, the system never sees the file, so in a sense we
 undo everything we did to the file. It is a kind of poor-man's rollback
 for certain operations, like btbuild, create table, etc. But it means
 that we cannot recover the file from the log, even after a commit.

Yep.

  
   
   §1.1.1 - CREATE DATABASE is also unlogged
   
 This will cause the same replay problems as above.
  
  Yep.  Again, seems a master cleanup on startup is needed.
 
 The cleanup is not the problem, only a nuisance. Creating the files
 during replay is the problem. I must recreate CREATE DATABASE from the
 log exactly as it was done originally. I think just logging the
 parameters to the command function should be sufficient, but I need to
 think more about it.

OK, makes sense.  Nice when you can bundle a complex action into the
logging of one command and its parameters.
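
For illustration, such a record might carry no more than this (the
struct is a sketch, not committed code):

/* Hypothetical WAL record for CREATE DATABASE: just the parameters
 * needed to re-run the directory copy during replay. */
typedef struct xl_dbase_create
{
    Oid     db_id;       /* OID of the database being created */
    Oid     src_db_id;   /* template database being copied */
    /* source and target path strings would follow the fixed part */
} xl_dbase_create;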

  
   b) If TRUNCATE TABLE fails, the system must PANIC. Otherwise, the table
   may be used in a future command, and a replay-recovered database may
   end up with different data than the original.
  
  We number based on oids.  You mean oid wraparound could cause the file
  to be used again?
 
 That's not what I meant. Let's say I issue 'TRUNCATE TABLE foo'. Then,
 right before smgrtruncate is called, I do an XLogInsert saying "Redo a
 TRUNCATE TABLE on foo to nblocks if we crash." Then smgrtruncate fails
 and we do an elog(ERROR).
 
 Now the user decides that since TRUNCATE TABLE didn't work, he might as
 well use the table, so he inserts some records into it, generating log
 entries.
 
 When I replay this log sequence later, what happens if the TRUNCATE
 succeeds instead of failing?

You mean the user is now accessing a partially truncated table? That's
just too weird.  I don't see how the WAL would know how far truncation
had gone.  I see why you would need the panic and it seems acceptable.

 I admit that there are other ways of handling it than to PANIC if the
 truncate fails. All the ones I can come up with seem to amount to some
 kind of ad-hoc UNDO log.

Yea, truncate failure seems so rare/impossible to happen, we can do a
panic and see if it ever happens to anyone.  I bet it will not.  Those
are usually cases of an OS crash, so it is the same as a panic.
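
The ordering being agreed on here looks roughly like this (both helper
names are stand-ins, not real smgr calls):

/* Illustrative only: log the truncate intent first; afterwards any
 * failure of the physical truncate must PANIC rather than ERROR, so
 * replay can never diverge from what the original system did. */
log_truncate_intent(rel, nblocks);       /* hypothetical XLogInsert wrapper */
if (!physically_truncate(rel, nblocks))  /* hypothetical smgr-level call */
    elog(PANIC, "truncate of \"%s\" failed after its WAL record was written",
         RelationGetRelationName(rel));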

   WAL must be flushed before truncate as well.
   
   WAL does not need to be flushed before create, if we don't mind 
   leaking files sometimes.
  
  Cleanup?
 
 Yes, we could garbage-collect leaked files. XLogFlush is not that
 expensive though, so I don't have an opinion on this yet.

Right now, if we do CREATE TABLE, and the backend 

Re: [HACKERS] (A) native Windows port

2002-07-05 Thread Oliver Elphick

On Fri, 2002-07-05 at 17:39, Lamar Owen wrote:
 No, what I envisioned was a standalone dumper that can produce dump output 
 without having a backend at all.  If this dumper knows about the various 
 binary formats, and knows how to get my data into a form I can then restore 
 reliably, I will be satisfied.  If it can be easily automated so much the 
 better.  Doing it table by table would be ok as well.
...
 1.)   Must not require the old version executable backend.  There are a number 
 of reasons why this might be, but the biggest is due to the way much 
 upgrading works in practice -- the old executables are typically gone by the 
 time the new package is installed.
 
 2.)   Uses pg_dbdump of the new version.  This dumper can be tailored to provide 
 the input pg_restore wants to see.  The dump-restore sequence has always had 
 dumped-data version mismatch as its biggest problem -- there have been issues 
 before where you would have to install the new version of pg_dump to run 
 against the old backend.  This is unacceptable in the real world of binary 
 packages.

I concur completely!

As a package maintainer, this would remove my biggest problem.


Oliver Elphick
(Debian maintainer)










Re: [HACKERS] Should next release be 8.0 (Was: Re: [GENERAL] I am being interviewed by O'Reilly)

2002-07-05 Thread Alessio Bragadini

In my book, schema support is a big thing, leading one to rethink a lot
of database organization and such. PostgreSQL 8 would stress this
importance.

-- 
Alessio F. Bragadini[EMAIL PROTECTED]
APL Financial Services  http://village.albourne.com
Nicosia, Cyprus phone: +357-22-755750

It is more complicated than you think
-- The Eighth Networking Truth from RFC 1925









Re: [HACKERS] BETWEEN Node & DROP COLUMN

2002-07-05 Thread Bruce Momjian


The problem is that the new column is now part of pg_attribute so every
catalog/pg_attribute.h DATA() line has to be updated.  Did you update
them all with 'false' in the right slot?  Not sure what the chunks are.


---

Christopher Kings-Lynne wrote:
  Christopher, if you are still having trouble adding the isdropped system
  column, please let me know.
 
 Thanks Bruce, but I think I've got it sorted now.  One weird thing is that
 although I added it as being false in pg_attribute.h, I get these tuples
 having attisdropped set to true by initdb.
 
 Are these from the toasting process and maybe the stats or something??
 
 Chris
 
  attrelid |  attname
 --+---
 16464 | chunk_id
 16464 | chunk_seq
 16464 | chunk_data
 16466 | chunk_id
 16466 | chunk_seq
 16467 | chunk_id
 16467 | chunk_seq
 16467 | chunk_data
 16469 | chunk_id
 16469 | chunk_seq
 16470 | chunk_id
 16470 | chunk_seq
 16470 | chunk_data
 16472 | chunk_id
 16472 | chunk_seq
 16473 | chunk_id
 16473 | chunk_seq
 16473 | chunk_data
 16475 | chunk_id
 16475 | chunk_seq
 16476 | chunk_id
 16476 | chunk_seq
 16476 | chunk_data
 16478 | chunk_id
 16478 | chunk_seq
 16479 | chunk_id
 16479 | chunk_seq
 16479 | chunk_data
 16481 | chunk_id
 16481 | chunk_seq
 16482 | chunk_id
 16482 | chunk_seq
 16482 | chunk_data
 16484 | chunk_id
 16484 | chunk_seq
 16485 | chunk_id
 16485 | chunk_seq
 16485 | chunk_data
 16487 | chunk_id
 16487 | chunk_seq
 16488 | chunk_id
 16488 | chunk_seq
 16488 | chunk_data
 16490 | chunk_id
 16490 | chunk_seq
 16491 | usecreatedb
 16491 | usesuper
 16491 | passwd
 16491 | valuntil
 16491 | useconfig
 16494 | schemaname
 16494 | tablename
 16494 | rulename
 16494 | definition
 16498 | schemaname
 16498 | viewname
 16498 | viewowner
 16498 | definition
 16501 | tablename
 16501 | tableowner
 16501 | hasindexes
 16501 | hasrules
 16501 | hastriggers
 16504 | tablename
 16504 | indexname
 16504 | indexdef
 16507 | tablename
 16507 | attname
 16507 | null_frac
 16507 | avg_width
 16507 | n_distinct
 16507 | most_common_vals
 16507 | most_common_freqs
 16507 | histogram_bounds
 16507 | correlation
 16511 | relid
 16511 | relname
 16511 | seq_scan
 16511 | seq_tup_read
 16511 | idx_scan
 16511 | idx_tup_fetch
 16511 | n_tup_ins
 16511 | n_tup_upd
 16511 | n_tup_del
 16514 | relid
 16514 | relname
 16514 | heap_blks_read
 16514 | heap_blks_hit
 16514 | idx_blks_read
 16514 | idx_blks_hit
 16514 | toast_blks_read
 16514 | toast_blks_hit
 16514 | tidx_blks_read
 16514 | tidx_blks_hit
 16518 | relid
 16518 | indexrelid
 16518 | relname
 16518 | indexrelname
 16518 | idx_scan
 16518 | idx_tup_read
 16518 | idx_tup_fetch
 16521 | relid
 16521 | indexrelid
 16521 | relname
 16521 | indexrelname
 16521 | idx_blks_read
 16521 | idx_blks_hit
 16524 | relid
 16524 | relname
 16524 | blks_read
 16524 | blks_hit
 16527 | datid
 16527 | datname
 16527 | procpid
 16527 | usesysid
 16527 | usename
 16527 | current_query
 16530 | datid
 16530 | datname
 16530 | numbackends
 16530 | xact_commit
 16530 | xact_rollback
 16530 | blks_read
 16530 | blks_hit
 
 
 
 
 
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] BETWEEN Node & DROP COLUMN

2002-07-05 Thread Christopher Kings-Lynne

  Well, what happens if you rename a column in a child table?  
 Same problem?
 
 It merrily renames the column in the child table (I tried it).  When
 SELECTing the parent, bogus data appears.  Looks like a bug to me.
 Maybe the ALTER TABLE ...  RENAME COLUMN code should check for inherited
 columns before renaming them.

Hmmm...so how does one check if one is a child in an inheritance hierarchy?

Chris









Re: [HACKERS] BETWEEN Node & DROP COLUMN

2002-07-05 Thread Bruce Momjian

Rod Taylor wrote:
  We could change pg_attribute to another name, and create a view called
  pg_attribute that never returned isdropped columns to the client.  That
  would allow clients to work cleanly, and the server to work cleanly.
 
 Another case where having an informational schema would eliminate the
 whole argument -- as the clients wouldn't need to touch the system
 tables.
 
 Any thoughts on that initial commit Peter?

From my new understanding, the client coders _want_ to see the isdropped
row so the attno's are consecutive.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] (A) native Windows port

2002-07-05 Thread Dann Corbit

On generic recovery...

What is wrong with this strategy...

0. Put the database in single user mode.

1. Dump the Schema, with creation order properly defined, and with all
constraints written to a separate file.  (IOW, one file contains the
bare tables with no index, constraint or trigger stuff, and the other
contains all the RI stuff.)

2. Dump the tables (one by one) to text files with copy

3. Create a new database in a new location.

4. Feed it the bare table schema

5. Pump in the table data using copy from the saved text files

6. Run the RI script to rebuild index, trigger, PKey, FKey, etc.

I find that is the most trouble free way to do it with most DBMS
systems.

An attempted dump from DBMS X.Y and a load to DBMS (X+1).Y is always a
pile of trouble waiting to happen -- no matter what the system is.








Re: [HACKERS] Typo in htup.h comment

2002-07-05 Thread Bruce Momjian


Your patch has been added to the PostgreSQL unapplied patches list at:

http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---


Manfred Koizar wrote:
 Fix typo in xl_heaptid comment
 
 Servus
  Manfred
 
 --- ../base/src/include/access/htup.h 2002-07-04 18:05:04.0 +0200
 +++ src/include/access/htup.h 2002-07-05 16:52:44.0 +0200
 @@ -268,15 +268,15 @@
  /*
   * When we insert 1st item on new page in INSERT/UPDATE
   * we can (and we do) restore entire page in redo
   */
  #define XLOG_HEAP_INIT_PAGE 0x80
  
  /*
 - * All what we need to find changed tuple (18 bytes)
 + * All what we need to find changed tuple (14 bytes)
   *
   * NB: on most machines, sizeof(xl_heaptid) will include some trailing pad
   * bytes for alignment.  We don't want to store the pad space in the XLOG,
   * so use SizeOfHeapTid for space calculations.  Similar comments apply for
   * the other xl_FOO structs.
   */
  typedef struct xl_heaptid
  {
   RelFileNode node;
   ItemPointerData tid;/* changed tuple id */
  } xl_heaptid;
 
 Servus
  Manfred
 
 
 
 
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] Page type and version

2002-07-05 Thread Bruce Momjian

Manfred Koizar wrote:
 As the upcoming release is breaking compatibility anyway:  what do you
 think about placing a magic number and some format version info into
 the page header?
 
 One 32-bit-number per page should be enough to encode page type and
 version.  We just have to decide how we want it:
 
 a) combine page type and version into a single 32-bit magic number
 
 HEAPPAGE73 = 0x63c490c9
 HEAPPAGE74 = 0x540beeb3
 ...
 BTREE73= 0x8cdc8edb
 BTREE74= 0xbb13f0a1
 
 b) use n bits for the page type and the rest for a version number
 
 HEAPPAGE73 = 0x63c40703
 HEAPPAGE74 = 0x63c40704
 ...
 BTREE73= 0x8cdc0703
 BTREE74= 0x8cdc0704
 
 The latter has the advantage that the software could easily check for
 a version range (e.g. if (PageGetVersion(page) >= 0x0703) ...).

Yea, b) sounds good.

 One might argue that one magic number *per file* should be
 sufficient.  That would mean that the first page of a file had to
 have a different format.  Btree has such a meta page;  I don't know
 about the other access methods.

Heap used to have a header page too but it was removed long ago.

We do have the TODO item:

* Add version file format stamp to heap and other table types

but I am now questioning why that is there.  btree had a version stamp,
so I thought heap should have one too, but since the PG_VERSION file
is in every directory, isn't that all that is needed for version
information?

My vote is just to remove the btree version.  If we decide to implement
multi-version reading in the backend, we can add it where appropriate.


 With a magic number in every single page it could even be possible to
 do a smooth upgrade:  Just install Postgres 8.0 and continue to use
 your PostgreSQL 7.4 databases :-).  Whenever the backend reads an old
 format page it uses alternative accessor routines.  New pages are
 written in the new format.  Or the database can be run in
 compatibility mode ...  I'm dreaming ...

Yes, and as I understand it, it is pretty easy to snoop from a tuple to
the end of the block to see what version stamp is there.  Will we ever
use it?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








[HACKERS] Thread discussion

2002-07-05 Thread Bruce Momjian

There was an interesting thread/process discussion in the gproff
Slashdot discussion:

http://slashdot.org/article.pl?sid=02/07/05/1457231&mode=nested&tid=106

This guy had interesting comments:

http://slashdot.org/~pthisis/

Especially this comment:

http://slashdot.org/comments.pl?sid=35441&cid=3829377

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026








Re: [HACKERS] Thread discussion

2002-07-05 Thread Dann Corbit

 -Original Message-
 From: Bruce Momjian [mailto:[EMAIL PROTECTED]]
 Sent: Friday, July 05, 2002 7:59 PM
 To: PostgreSQL-development
 Subject: [HACKERS] Thread discussion
 
 
 There was an interesting thread/process discussion in the gproff
 Slashdot discussion:
 
   
 http://slashdot.org/article.pl?sid=02/07/05/1457231&mode=nested&tid=106

This guy had interesting comments:

http://slashdot.org/~pthisis/

Especially this comment:

http://slashdot.org/comments.pl?sid=35441&cid=3829377

==
Which is pretty much pointless MS bashing and incorrect.

From the news:comp.programming.threads FAQ:

 Q147: Thread create timings  

Matthew Houseman  writes:

Thought I'd throw this into the pyre. :)  I ran the thread/process create
stuff on a 166MHz Pentium (no pro, no mmx) under NT4 and Solaris x86 2.6:


NT spawn                 240s   24.0  ms/spawn
Solaris spawn (fork)     123s   12.3  ms/spawn  (incl. exec)
Solaris spawn (vfork)     95s    9.5  ms/spawn  (incl. exec)

Solaris fork              47s    4.7  ms/fork
Solaris vfork                    0.37 ms/vfork  (37s/10^5)

NT thread create          12s    1.2  ms/create
Solaris thread create            0.11 ms/create (11s/10^5)


As you can see, I tried both fork() and vfork(). When doing an immediate
exec(), you'd normally use vfork(); when just forking, fork() is usually
what you want to use (or have to use).

Note that I had to turn the number of creates up to 10^5 for vfork
and thread create to get better precision in the timings.


To remind you, here are greg's figures (on a Pentium MMX 200MHz):

NT Spawner (spawnl):          120 Seconds (12.0 millisecond/spawn)
Linux Spawner (fork+exec):     57 Seconds ( 6.0 millisecond/spawn)

Linux Process Create (fork):   10 Seconds ( 1.0 millisecond/proc)

NT Thread Create:               9 Seconds ( 0.9 millisecond/thread)
Linux Thread Create:            3 Seconds ( 0.3 millisecond/thread)


Just for fun, I tried the same thing on a 2 CPU 170MHz Ultrasparc.
I leave it to someone else to figure out how much of this is due to
the two CPUs... :)

Solaris spawn (fork)     84s    8.4  ms/spawn  (incl. exec)
Solaris spawn (vfork)    69s    6.9  ms/spawn  (incl. exec)

Solaris fork             21s    2.1  ms/fork
Solaris vfork                   0.17 ms/vfork  (17s/10^5)

Solaris thread create           0.06 ms/create (6s/10^5)


 Q148: Timing Multithreaded Programs (Solaris)  

From: [EMAIL PROTECTED] (Richard Sullivan)

I'm trying to time my multithreaded programs on Solaris with multiple
processors.  I want the real-world running time as opposed to the total
execution time of the program, because I want to measure speedup versus
sequential algorithms and how much faster the parallel program is for
the user.

Bradly,

  Here is what I wrote to solve this problem (for Solaris anyway).  To
use it just call iobench_start() after any setup that you don't want
to measure.  When you are done measuring call iobench_end().  When you
want to see the statistics call iobench_report().  The output to
stderr will look like this:

Process info:
  elapsed time  249.995
  CPU time  164.446
  user time 152.095
  system time   12.3507
  trap time 0.661235
  wait time 68.6506
  pfs major/minor       3379/ 0
  blocks input/output  0/ 0
 
65.8% CPU usage

The iobench code is included in the program sources on: index.html.

My opinion is that PostgreSQL does not have to exclusively fork() or
exclusively thread.
As Spike Lee said:
"Always do the right thing."








Re: [HACKERS] Thread discussion

2002-07-05 Thread Bruce Momjian

Dann Corbit wrote:
 Especially this comment:
 
   http://slashdot.org/comments.pl?sid=35441&cid=3829377
 
 ==
 Which is pretty much pointless MS bashing and incorrect.

Is there such a thing?  ;-)

Anyway, the analysis of Solaris is meaningless.  It is in the same camp
as NT as far as process creation bloat.  I have always said threads help
on NT _and_ Solaris.

On Solaris, the thread popularity is there _because_ the OS is so slow
at process creation (SVr4 bloat), not necessarily because people really
want threads on Solaris.

 NT Spawner (spawnl):          120 Seconds (12.0 millisecond/spawn)
 Linux Spawner (fork+exec):     57 Seconds ( 6.0 millisecond/spawn)
 
 Linux Process Create (fork):   10 Seconds ( 1.0 millisecond/proc)
 
 NT Thread Create:               9 Seconds ( 0.9 millisecond/thread)
 Linux Thread Create:            3 Seconds ( 0.3 millisecond/thread)

The Linux case is more interesting.  The same guy had timings for thread
vs. process of 6usecs vs. 4usecs, but states that it really isn't even a
blip on the performance radar, and the coding required to do the stuff
in a threaded manner is a headache:

http://slashdot.org/article.pl?sid=02/07/05/1457231&tid=106

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026


