Re: [HACKERS] XLogWrite: write request 0/53A4000 is past end of log 0/53A4000

2004-07-16 Thread Alvaro Herrera
On Fri, Jul 16, 2004 at 01:32:01AM -0400, lists wrote:

 LOG:  ReadRecord: unexpected pageaddr 0/33A4000 in log file 0, segment 
 5, offset 3817472
 LOG:  redo is not required
 PANIC:  XLogWrite: write request 0/53A4000 is past end of log 0/53A4000
 LOG:  startup process (pid 16068) was terminated by signal 6
 LOG:  aborting startup due to startup process failure
 
 this is 7.3.3 (PGDG rpm's) on red hat 7.2

Get 7.3.6 and try again.  IIRC this is a known bug that was repaired in
7.3.4.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
In the beginning there was UNIX, and UNIX spoke and said: Hello world\n.
It did not say Hello New Jersey\n, nor Hello USA\n.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] XLogWrite: write request 0/53A4000 is past end of log

2004-07-16 Thread lists
thank you for replying.
Alvaro Herrera wrote:
LOG:  ReadRecord: unexpected pageaddr 0/33A4000 in log file 0, segment 
5, offset 3817472
LOG:  redo is not required
PANIC:  XLogWrite: write request 0/53A4000 is past end of log 0/53A4000
LOG:  startup process (pid 16068) was terminated by signal 6
LOG:  aborting startup due to startup process failure

this is 7.3.3 (PGDG rpm's) on red hat 7.2
Get 7.3.6 and try again.  IIRC this is a known bug that was repaired in
7.3.4.
do you know if there are rpm's for RH7.2 available?
to get the data back, is it a matter of just upgrading and starting the 
db (ie will it perform recovery)?

thanks
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


[HACKERS] schema dumping

2004-07-16 Thread Christopher Kings-Lynne
Hi,
Can I change pg_dump to never use the AUTHORIZATION clause and use OWNER 
TO instead?  It would make things a lot simpler, especially in the case 
where the public schema being dumped has had its ownership changed.

This is what the comment says:
* Note that ownership is shown in the AUTHORIZATION clause,
* while the archive entry is listed with empty owner (causing
* it to be emitted with SET SESSION AUTHORIZATION DEFAULT).
* This seems the best way of dealing with schemas owned by
* users without CREATE SCHEMA privilege.  Further hacking has
* to be applied for --no-owner mode, though!
We now have a better way, but it's non-standard.  Shall I change it?
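For clarity, the two styles look roughly like this (just a sketch: schema and role names
are invented, exact pg_dump output differs, and the second form assumes the
ALTER SCHEMA ... OWNER TO support we now have):

-- current pg_dump style (SQL-standard AUTHORIZATION clause):
CREATE SCHEMA myschema AUTHORIZATION someuser;

-- proposed style (non-standard): create the schema, then hand over ownership
CREATE SCHEMA myschema;
ALTER SCHEMA myschema OWNER TO someuser;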
Chris
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Converting epoch to timestamp

2004-07-16 Thread Christopher Kings-Lynne
SELECT EXTRACT(EPOCH FROM TIMESTAMP '2003-01-01 11:23:44');

Yeah, but I think Michael's question had to do with going the other
way (numeric to timestamp).
Sorry,
SELECT EXTRACT(TIMESTAMP FROM EPOCH '123412341234');
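(That syntax doesn't actually exist, of course. For the record, the numeric-to-timestamp
direction can already be written without any new syntax; a minimal sketch using interval
arithmetic on the 'epoch' constant, with the same illustrative value:)

-- convert a Unix epoch value (in seconds) into a timestamp
SELECT TIMESTAMP WITH TIME ZONE 'epoch' + 123412341234 * INTERVAL '1 second';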
Chris
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] XLogWrite: write request 0/53A4000 is past end of log

2004-07-16 Thread Christopher Kings-Lynne
 LOG:  ReadRecord: unexpected pageaddr 0/33A4000 in log file 0, segment
5, offset 3817472
LOG:  redo is not required
PANIC:  XLogWrite: write request 0/53A4000 is past end of log 0/53A4000
LOG:  startup process (pid 16068) was terminated by signal 6
LOG:  aborting startup due to startup process failure
this is 7.3.3 (PGDG rpm's) on red hat 7.2

Get 7.3.6 and try again.  IIRC this is a known bug that was repaired in
7.3.4.
That won't help him start up his DB though will it?  It will just 
prevent the problem in the future?  Is it the same fix as for me? 
Appending zeros to a file?

Chris
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] XLogWrite: write request 0/53A4000 is past end of log 0/53A4000

2004-07-16 Thread Alvaro Herrera
On Fri, Jul 16, 2004 at 02:09:33PM +0800, Christopher Kings-Lynne wrote:
  LOG:  ReadRecord: unexpected pageaddr 0/33A4000 in log file 0, segment
 5, offset 3817472
 LOG:  redo is not required
 PANIC:  XLogWrite: write request 0/53A4000 is past end of log 0/53A4000
 LOG:  startup process (pid 16068) was terminated by signal 6
 LOG:  aborting startup due to startup process failure
 
 this is 7.3.3 (PGDG rpm's) on red hat 7.2
 
 
 Get 7.3.6 and try again.  IIRC this is a known bug that was repaired in
 7.3.4.
 
 That won't help him start up his DB though will it?

AFAIU it should ...

 It will just prevent the problem in the future?  Is it the same fix as
 for me?  Appending zeros to a file?

As far as I remember, your problem was much worse ... This is a strange
bug where the Xlog code would refuse to replay a record that starts just
at the start of the segment.  A later version deals with this situation
correctly.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
This mail is delivered guaranteed 100% free of sarcasm.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] XLogWrite: write request 0/53A4000 is past end of log 0/53A4000

2004-07-16 Thread Alvaro Herrera
On Fri, Jul 16, 2004 at 01:55:13AM -0400, lists wrote:
 thank you for replying.
 
 Alvaro Herrera wrote:
 LOG:  ReadRecord: unexpected pageaddr 0/33A4000 in log file 0, segment 
 5, offset 3817472
 LOG:  redo is not required
 PANIC:  XLogWrite: write request 0/53A4000 is past end of log 0/53A4000
 LOG:  startup process (pid 16068) was terminated by signal 6
 LOG:  aborting startup due to startup process failure
 
 this is 7.3.3 (PGDG rpm's) on red hat 7.2
 
 Get 7.3.6 and try again.  IIRC this is a known bug that was repaired in
 7.3.4.
 
 do you know if there are rpm's for RH7.2 available?

No idea, sorry.

 to get the data back, is it a matter of just upgrading and starting the 
 db (ie will it perform recovery)?

That should do.

Depending on how much you value your data, however, it might be
advisable to make a backup of the data directory just in case ...

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
To think that the spectre we see is illusory does not strip it of its terror;
it only adds the new terror of madness (Perelandra, C. S. Lewis)


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-hackers-win32] [HACKERS] Weird new time zone

2004-07-16 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, July 15, 2004 10:03 PM
 To: Dann Corbit
 Cc: Oliver Jowett; Magnus Hagander; Hackers; 
 [EMAIL PROTECTED]
 Subject: Re: [pgsql-hackers-win32] [HACKERS] Weird new time zone 
 
 
 Dann Corbit [EMAIL PROTECTED] writes:
  All translations between UTC time and local time are based on the 
  following formula:
 
  UTC = local time + bias
 
 Surely not.  Or has Windows not heard of daylight-savings 
 time? Or perhaps they have, but are not aware that the DST 
 laws have changed often in the past?

No problems.  They even handle time zones with arbitrary minute
boundaries (not on the hour) with aplomb.

 Over-simplistic answers are not what we need here.  The data 
 structure you quote cannot even tell what DST transition 
 dates Windows thinks are in effect this year, let alone what 
 it thinks the dates were in past years.

Yes, there are other calls for that, obviously.  I sent to Mr. Momjian a
complete implementation of time zone stuff that uses Windows calls.
It's also accurate to a fraction of a nanosecond millions of years into
the past and the future.

The call that I showed returns the NAME OF THE TIME ZONE and also what
it is called when you are in Daylight savings time.  I thought the issue
under question was to find out what the time zone was.

This program:

#include <windows.h>
#include <iostream>
using namespace std;

int main(void)
{
TIME_ZONE_INFORMATION tz;
DWORD   i = GetTimeZoneInformation(&tz);
for (i = 0; i < 32 && tz.StandardName[i]; i++)
   cout << (TCHAR) tz.StandardName[i];
cout << endl;
for (i = 0; i < 32 && tz.DaylightName[i]; i++)
   cout << (TCHAR) tz.DaylightName[i];
cout << endl;
return 0;
}

Prints this:
Pacific Standard Time
Pacific Daylight Time

There is also a global variable called _tzname that contains the name of
the time zone.
See:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt__daylight.2c_._timezone.2c_.and__tzname.asp

On my machine:
cout << _tzname[0];  // -> PST
cout << _tzname[1];  // -> PDT


As far as doing the calculations for time values, do whatever you like
(it's not that difficult either, and the code I sent does address all
that stuff, though it is in C++).  Don't forget that things are inverted
in the southern hemisphere.

Have a good one.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Converting epoch to timestamp

2004-07-16 Thread Michael Glaesemann
On Jul 16, 2004, at 3:08 PM, Christopher Kings-Lynne wrote:
SELECT EXTRACT(TIMESTAMP FROM EPOCH '123412341234');
That's a really interesting idea! Makes for a much more consistent 
syntax for our other functions. ISTM this might require a native EPOCH 
datatype. I wouldn't want to encourage people to use EPOCH as a 
datatype though.

Michael Glaesemann
grzm myrealbox com
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 04:49, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Fri, 2004-07-16 at 00:01, Alvaro Herrera wrote:
  My manpage for signal(2) says that you shouldn't assign SIG_IGN to
  SIGCHLD, according to POSIX.
 
  So - I should be setting this to SIG_DFL and that's good for everyone?
 
 Yeah, we learned the same lesson in the backend not too many releases
 back.  SIG_IGN'ing SIGCHLD is bad voodoo; it'll work on some platforms
 but not others.

Many thanks all, Best Regards Simon Riggs





---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Bruce Momjian wrote:
Andreas Pflug wrote:
You do something that splits the value into directory name and file name
and removes every letter after %.
/var/log
postgresql.log.%-%-%_%%%
Another idea is to allow filename wildcards in the listing so it
becomes:
SELECT *
FROM dir_listing('/var/log/postgresql.log.*-*-*_***') AS dir
While that is nice, it doesn't match the functionality of opendir so we
are perhaps better with one that doesn't handle wildcards and we just do
the wildcard processing in the WHERE clause.
Uh, this looks ugly.
How about
pg_logfile_list() RETURNS setof timestamp -- to list available logfiles
pg_logfile_filename(timestamp) to return filename for that logfile

I don't see the need to return timestamps. If you list the directory,
you can just return the file names.  The only reason you
might need a pattern is to distinguish pg log files from other log
files.  If you want, create a server-side function that returns the file
name with the strftime() patterns converted to '*'.

and generic
pg_dir(wildcard_text)

Maybe pg_dir_ls().
OK, it would be nice if we could do a sed operation like:
sed 's/%./*/g'
but I don't know a way to do that without defining a function or pulling
in a procedural language, but if we could do it we could do:
pg_dir(echo log_destination | sed 's/%./*/g')
Ever used sed on win32?!? And how should the timestamp be 
represented in client tools? Date/time interpretation is always a source 
of problems, so *please* let the server do that.

Rethinking all this, I'd like the pg_logfile_list to return a complex type:
CREATE TYPE pg_logfile_list AS (
filedate timestamp,
filename text,
backendpid int,
inuse bool)
and
pg_logfile_list() RETURNS SETOF pg_logfile_list
which would enable
SELECT  filename,
pg_file_unlink(filename)
  FROM  pg_logfile_list()
 WHERE  filedate < current_timestamp - '3 months'::interval
AND NOT inuse
In-use check is easy for the backend, if the syslog process publishes 
the current logfile's timestamp in sharedmem.

We can use a GUC variable for the log_directory (not log_destination); 
anyway, I'd like the filenames to be selected by the server.

Regards,
Andreas
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [pgsql-hackers-win32] [HACKERS] Weird new time zone

2004-07-16 Thread Magnus Hagander
  It occurs to me that with a check this thorough, we might 
 be able to 
  finesse the problem on Windows with the system returning very 
  nonstandard timezone abbreviations.
 
  It does *not* pick up my timezone.
 
 Drat.  I assume from your domain name that Europe/Stockholm 
 would actually be the best choice for you?  What Windows 
 timezone setting are you using for this test?

Yup, that would be the best. Either that or CET/CEST.

I'm using the timezone that in the GUI is called "(GMT +01:00)
Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna". strftime with %z
calls it "W. Europe Daylight Time".


  The following possibilities are rejected per:
  DEBUG:  Reject TZ Europe/Stockholm: at 16844400 
 1970-07-15 00:00:00 
  std versus 1970-07-15 01:00:00 dst
 
 If you look in src/timezone/data/europe you will see that the 
 zic database thinks Sweden was on strict GMT+1 (no daylight 
 savings) between
 1916 and 1980, and since 1980 they were on EU daylight-savings rules.
 Does that square with your ideas of reality?  (If it does not 
 then we should just punt the problem upstream to the zic 
 people, but I will assume here that their research is good.)

Actually, I was sure that was wrong, but started looking it up, and it
seems it's about right. See
http://www.sp.se/metrology/timefreq/eng/standard_time.htm for an
official take on the summertime rules. I *think* that is the EU rules.


 What I suspect given the above is that Windows has no clue 
 about historical reality and is retroactively applying the 
 current DST rules back to 1970, thus deciding that 1970-07-15 
 was on DST when it was really not.

That could definitely be. Since before 1970 they say there is no DST at
all (remember the old issues), it wouldn't surprise me a bit if they
took this kind of shortcut.


 I thought about restricting the scope of the TZ testing to 
 start in 1990 or so to avoid this, but that seems certain to 
 fall foul of the other problem, which is distinguishing 
 closely-related timezones (cf Chris K-L discovering that he 
 lives in Antarctica, a few days back...)
 
 Maybe the whole match-on-behavior approach is wrong and we 
 need to do something else, but I'm really unsure what.  Ideas?

Well, Windows-specific we could do a translation table, since there is a
finite (and not *huge*) number of timezones. But we'd have to figure out
for each one which to match. But that won't work for unix I think - the
lookup table would be just huge considering all possible
combinations...

//Magnus

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] analyze.c

2004-07-16 Thread Anja Klein
Hi guys,

during my studies I have to do some practical work: I must extend PostgreSQL with some 
sampling algorithms. Therefore I implemented some trigger functions, and that works fine.
But now I have to make some changes concerning the parser, particularly analyze.c. If a 
user sends a CREATE TABLE command, PostgreSQL should create several sample tables, 
which will be filled with the sample data from the original table. For a start, 
it would be enough if two tables were created: the original one, with the table name, 
columns etc. selected by the user, and a second one, a copy with the same columns etc. but 
another name, e.g. sample.

My problem is that the system catalogs are created at initdb time. Then the following 
failure occurs:

loading pg_description... /usr/local/pqsql/initdb: line 837: 22348 Done (141)
   (cat <<EOF
   CREATE TEMP TABLE tmp_pg_description (objoid oid, classname name, objsuboid 
int4,  description text)
   WITHOUT OIDS;
   COPY tmp_pg_description FROM STDIN;
EOF
   cat $POSTGRES_DESC; cat
   <<EOF
\.
  INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsuboid, 
t.description FROM tmp_pg_description t,
  pg_class c WHERE c.relname = t.classname;
  EOF)
22349 Segmentation fault  $PGPATH/postgres $PGSQL_OPT template1 >/dev/null

How can I tell PostgreSQL that it should only create the sample table when the 
original table to be created has the name origin or something like that? Or that it 
should only act if some user wants to create the table, not PostgreSQL itself?

thank you very much!

greetings, anja


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD

  I'm aiming for the minimum feature set - which means we do need to take
  care over whether that set is insufficient and also to pull any part
  that doesn't stand up to close scrutiny over the next few days.
 
 As you can see, we are still chewing on NT.  What PITR features are
 missing?  I assume because we can't stop file system writes during
 backup that we will need a backup parameter file like I described.  Is
 there anything else that PITR needs?

No, we don't need to stop writes! Not even to split a mirror;
other DBs need that to be able to restore, but we don't.
We only need to tell people to back up pg_control first. The rest was only 
intended to enforce 
1. that pg_control is the first file backed up
2. the dba uses a large enough PIT (or xid) for restore

I think the idea with an extra file with WAL start position was overly
complicated, since all you need is pg_control (+ WAL end position to enforce 2.).

If we don't want to tell people to backup pg_control first, imho the next 
best plan would be to add a WAL start input (e.g. xlog name) parameter 
to recovery.conf, that fixes pg_control.

Andreas

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [pgsql-hackers-win32] [HACKERS] Weird new time zone

2004-07-16 Thread Magnus Hagander
  Over-simplistic answers are not what we need here.  The 
 data structure 
  you quote cannot even tell what DST transition dates Windows thinks 
  are in effect this year, let alone what it thinks the dates were in 
  past years.
 
 Yes, there are other calls for that, obviously.  I sent to 
 Mr. Momjian a complete implementation of time zone stuff that 
 uses Windows calls.
 It's also accurate to a fraction of a nanosecond millions of 
 years into the past and the future.

You wouldn't happen to know if Windows internally has knowledge of the
many different DST rules in force in different years? See the other mail
about how it apparently deals with Swedish DST - if that's common or a
single case?
If the OS doesn't have that knowledge, we can give up trying to get it
to tell us about it :-)


 The call that I showed returns the NAME OF THE TIME ZONE and 
 also what it is called when you are in Daylight savings time. 
  I thought the issue under question was to find out what the 
 time zone was.

Nope, we already had that. The issue is that the names are not the same
as the ones used in zic/unix, so there is nothing to match on.

//Magnus


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-hackers-win32] [HACKERS] Weird new time zone

2004-07-16 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 I thought the issue under question was to find out what the 
 time zone was.

 Nope, we already had that. The issue is that the names are not the same
 as the one used in zic/unix, so there is nothing to match on.

Right.  The problem we are actually faced with is to identify which of
the zic timezones is the best match for the system's timezone setting.
One of the issues is that it's not clear what best means...

At the moment I like Oliver Jowett's idea of defining best as the one
that matches furthest back.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 In-use check is easy for the backend, if the syslog process publishes 
 the current logfile's timestamp in sharedmem.

You really haven't absorbed any of the objections I've raised, have you?
I don't want the log process connected to shared mem at *all*, and see
no particularly good reason why it should be.

 We can use a GUC variable for the log_directory (not log_destination); 
 anyway, I'd like the filenames to be selected by the server.

The directory should definitely be a GUC variable.  The individual
filenames should probably be of the form <prefix><timestamp>, where
the server dictates the format of the timestamp (and we choose it so
that the names sort correctly).  We could let the prefix be
user-selectable or make it hard-wired; I don't have a strong feeling
about that either way.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Tom Lane
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 We only need to tell people to backup pg_control first. The rest was only 
 intended to enforce 
 1. that pg_control is the first file backed up
 2. the dba uses a large enough PIT (or xid) for restore

Right, but I think Bruce's point is that it is far too easy to get those
things wrong; especially point 2 for which a straight tar dump will
simply not contain the information you need to determine what is a safe
stopping point.

I agree with Bruce that we should have some mechanism that doesn't rely
on the DBA to get this right.  Exactly what the mechanism should be is
certainly open for discussion...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Bruce Momjian
Andreas Pflug wrote:
  OK, it would be nice if we could do a sed operation like:
  
  sed 's/%./*/g'
  
  but I don't know a way to do that without defining a function or pulling
  in a procedural language, but if we could do it we could do:
  
  pg_dir(echo log_destination | sed 's/%./*/g')
  
 
 Ever used sed on win32?!? And how should the timestamp be 
 represented in client tools? Date/time interpretation is always a source 
 of problems, so *please* let the server do that.

I am thinking of these all being server-side functions.

 Rethinking all this, I'd like the pg_logfile_list to return a complex type:
 
 CREATE TYPE pg_logfile_list AS (
   filedate timestamp,
   filename text,
   backendpid int,
   inuse bool)
 
 and
 
 pg_logfile_list() RETURNS SETOF pg_logfile_list
 
 which would enable
 
 SELECT  filename,
  pg_file_unlink(filename)
FROM  pg_logfile_list()
   WHERE  filedate < current_timestamp - '3 months'::interval
  AND NOT inuse
 
 In-use check is easy for the backend, if the syslog process publishes 
 the current logfile's timestamp in sharedmem.
 
 We can use a GUC variable for the log_directory (not log_destination); 
 anyway, I'd like the filenames to be selected by the server.

This seems quite involved.  Can we get the basic functionality I
described first?  Also I am not sure how all this information is going
to be passed from the logging process to the backend requesting the
information, and it seems overly complicated.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian
 Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
  We only need to tell people to backup pg_control first. The rest was only 
  intended to enforce 
  1. that pg_control is the first file backed up
  2. the dba uses a large enough PIT (or xid) for restore
 
 Right, but I think Bruce's point is that it is far too easy to get those
 things wrong; especially point 2 for which a straight tar dump will
 simply not contain the information you need to determine what is a safe
 stopping point.
 
 I agree with Bruce that we should have some mechanism that doesn't rely
 on the DBA to get this right.  Exactly what the mechanism should be is
 certainly open for discussion...

Right.  I am wondering what process people would use to backup
pg_control first?  If they do:

tar -cf $TAPE ./global/pg_control .

They will get two copies of pg_control, the early one, and one as part
of the directory scan.  On restore, they would restore the early one,
but the directory scan would overwrite it.  I suppose they could do:

cp global/pg_control global/pg_control.backup
tar -cf $TAPE .

then on restore, once all the files are restored, move
pg_control.backup back to its original name.  That gives us the checkpoint
wal/offset, but how do we get the start/stop information?  Is that not
required?  Maybe we should just have start/stop server-side functions
that create a file in the archive directory describing the start/stop
counters and time, and the admin would then have to find those values.
Why are the start/stop wal/offset values needed anyway?  I know why we
need the checkpoint value.  Do we need a checkpoint after the archiving
starts but before the backup begins?

Also, when you are in recovery mode, how do you get out of recovery
mode, meaning if you have a power failure, how do you prevent the system
from doing another recovery?  Do you remove the recovery.conf file?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [pgsql-hackers-win32] [HACKERS] Weird new time zone

2004-07-16 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 At the moment I like Oliver Jowett's idea of defining best 
 as the one that matches furthest back.

 Sounds reasonable to me. As long as a clear warning is put in the log
 whenever something is picked that is not a perfect match,

Define perfect match.  I do not think we can really tell if we have an
exact match or not; the libc timezone API is just too limited to be
sure.  And on many platforms we can be sure we will never have an exact
match, especially if we look at years before 1970.

If you want something in the log I'd be inclined to just always make a
log entry when we infer a timezone setting.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Also, when you are in recovery mode, how do you get out of recovery
 mode, meaning if you have a power failure, how do you prevent the system
 from doing another recovery?  Do you remove the recovery.conf file?

I do not care for the idea of a recovery.conf file at all, and have been
intending to look to see what we'd need to do to not have one.  I find
it hard to believe that there is anything one would put in it that is
really persistent state.  The above concern shows why it shouldn't be
treated as a persistent configuration file.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD

 then on restore once all the files are restored move the
 pg_control.backup to its original name.  That gives us the checkpoint
 wal/offset but how do we get the start/stop information.  Is that not
 required?

The checkpoint wal/offset is in pg_control, that is sufficient start 
information. The stop info is only necessary as a safeguard.

 Do we need a checkpoint after the archiving
 starts but before the backup begins?

No.

 Also, when you are in recovery mode, how do you get out of recovery
 mode, meaning if you have a power failure, how do you prevent the system
 from doing another recovery?  Do you remove the recovery.conf file?

pg_control could be updated during rollforward (only if that actually 
does a checkpoint). So if pg_control is also the recovery start info, then 
we can continue from there if we have a power failure.
For the first release it would imho also be ok to simply start over if
you lose power.

I think the filename 'recovery.conf' is misleading, since it is not a 
static configuration file, but a command file for one recovery.
How about 'recovery.command' then 'recovery.inprogress', and on recovery 
completion it should be renamed to 'recovery.done'

Andreas

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Tom Lane
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 Do we need a checkpoint after the archiving
 starts but before the backup begins?

 No.

Actually yes.  You have to start at a checkpoint record when replaying
the log, so if no checkpoint occurred between starting to archive WAL
and starting the tar backup, you have a useless backup.

It would be reasonable to issue a CHECKPOINT just before starting the
backup as part of the standard operating procedure for taking PITR
dumps.  We need not require this, but it would help to avoid this
particular sort of mistake; and of course it might save a little bit of
replay effort if the backup is ever used.

As far as the business about copying pg_control first goes: there is
another way to think about it, which is to copy pg_control to another
place that will be included in your backup.  For example the standard
backup procedure could be

1. [somewhat optional] Issue CHECKPOINT and wait till it finishes.

2. cp $PGDATA/global/pg_control $PGDATA/pg_control.dump

3. tar cf /dev/mt $PGDATA

4. do something to record ending WAL position

If we standardized on this way, then the tar archive would automatically
contain the pre-backup checkpoint position in ./pg_control.dump, and
there is no need for any special assumptions about the order in which
tar processes things.

However, once you decide to do things like that, there is no reason why
the copied file has to be an exact image of pg_control.  I claim it
would be more useful if the copied file were plain text so that you
could just cat it to find out the starting WAL position; that would
let you determine without any special tools what range of WAL archive
files you are going to need to bring back from your archives.

This is pretty much the same chain of reasoning that Bruce and I went
through yesterday to come up with the idea of putting a label file
inside the tar backups.  We concluded that it'd be worth putting
both the backup starting time and the checkpoint WAL position into
the label file --- the starting time isn't needed for restore but
might be really helpful as documentation, if you needed to verify
which dump file was which.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD

  Do we need a checkpoint after the archiving
  starts but before the backup begins?
 
  No.
 
 Actually yes.

Sorry, I incorrectly failed to connect 'archiving' with the backed-up xlogs :-(
So yes, you need one checkpoint after archiving starts. Imho turning on xlog
archiving should issue such a checkpoint just to be sure. 

Andreas

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Bruce Momjian wrote:
This seems quite involved.  Can we get the basic functionality I
described first?
On the way.
Also I am not sure how all this information is going
to be passed from the logging process to the backend requesting the
information, and it seems overly complicated.
There's *no* information passing from the logging process, with the 
single exception of the latest logfile timestamp (if allowed). I'd 
rather like to have that information from the logger, to be safe in case 
the system time was manipulated and the last logfile is not the current one.
The rest is just a reworked version of pg_dir_ls, with internal 
knowledge of how the timestamp is formatted.

Regards,
Andreas


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Tom Lane wrote:
Andreas Pflug [EMAIL PROTECTED] writes:
In-use check is easy for the backend, if the syslog process publishes 
the current logfile's timestamp in sharedmem.

You really haven't absorbed any of the objections I've raised, have you?
I don't want the log process connected to shared mem at *all*, and see
no particularly good reason why it should be.
Why shouldn't the process announce the logfile timestamp and its pid 
*writeonly*, so other backends know about it?

At least the pid must be distributed like this, just as bgwriter does.
I understand perfectly that postmaster and logger are very critical 
processes, so they should be dependent on as few resources as possible. 
The logger works without shmem in general, but how to reach it if its 
pid is unknown?


The directory should definitely be a GUC variable.  The individual
 filenames should probably be of the form <prefix><timestamp>, where
the server dictates the format of the timestamp (and we choose it so
that the names sort correctly).  We could let the prefix be
user-selectable or make it hard-wired; I don't have a strong feeling
about that either way.
Agreed.
Regards,
Andreas
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 16:58, Zeugswetter Andreas SB SD wrote:
   Do we need a checkpoint after the archiving
   starts but before the backup begins?
  
   No.
  
  Actually yes.
 
 Sorry, I did incorrectly not connect 'archiving' with the backed up xlogs :-(
 So yes, you need one checkpoint after archiving starts. Imho turning on xlog
 archiving should issue such a checkpoint just to be sure. 
 

By agreement, archive_mode can only be turned on at postmaster startup,
which means you always have a checkpoint - either because you shut it
down cleanly, or you didn't and it recovers, then writes one.

There is always something to start the rollforward. 

So, non-issue.

Best regards, Simon Riggs


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 15:27, Bruce Momjian wrote:

 Also, when you are in recovery mode, how do you get out of recovery
 mode, meaning if you have a power failure, how do you prevent the system
 from doing another recovery?  Do you remove the recovery.conf file?

That was the whole point of the recovery.conf file:
it prevents you from re-entering recovery accidentally, as would occur
if the parameters were set in the normal .conf file.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 16:25, Zeugswetter Andreas SB SD wrote:

 I think the filename 'recovery.conf' is misleading, since it is not a 
 static configuration file, but a command file for one recovery.
 How about 'recovery.command' then 'recovery.inprogress', and on recovery 
 completion it should be renamed to 'recovery.done'

You understand this and your assessment is correct.

recovery.conf isn't an attempt to persist information. It is a means of
delivering a set of parameters to the recovery process, as well as
signalling overall that archive recovery is required (because the system
default remains the same, which is to recover from the logs it has
locally available to it).

I originally offered a design which used a command, similar to
DB2/Oracle...that was overruled as too complex. The (whatever you call
it) file is just a very simple way of specifying what's required.

There is more to be said here...clearly some explanations are required
and I will provide those later...

Best regards, Simon Riggs


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] analyze.c

2004-07-16 Thread Peter Eisentraut
Anja Klein wrote:
 How can I tell PostgreSQL that it should only create the sample table
 when the original table to be created has the name origin or
 something like that? Or that it should only act if some user wants
 to create the table, not PostgreSQL itself?

You would presumably do a strcmp() on the name of the table to be 
created before you jump into your new code that creates the sample 
table.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Bruce Momjian
Andreas Pflug wrote:
 Bruce Momjian wrote:
 
  
  This seems quite involved.  Can we get the basic functionality I
  described first?
 
 On the way.
 
  Also I am not sure how all this information is going
  to be passed from the logging process to the backend requesting the
  information, and it seems overly complicated.
 
 There's *no* information passing from the logging process, with the 
 single exception of the latest logfile timestamp (if allowed). I'd 
 rather like to have that information from the logger, to be safe in case 
 the system time was manipulated and the last logfile is not the current one.
 The rest is just a reworked version of pg_dir_ls, with internal 
 knowledge of how the timestamp is formatted.

Oh, so you are hardcoding the logfile name so you can interpret the
timestamp from that?  It seems cleaner to allow the admin to specify
whatever log pattern they want.

However, your idea of expiring the log files based on timestamp values is
pretty powerful.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Bruce Momjian
Andreas Pflug wrote:
 Bruce Momjian wrote:
 
  
  This seems quite involved.  Can we get the basic functionality I
  described first? 
 
 Current workable patch.
 
 Some questions/limitations:
 - How's the official way to restrict pg_* functions to superuser only

Very crudely  :-)

static int pg_signal_backend(int pid, int sig)
{
if (!superuser())
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 (errmsg("only superuser can signal other backends"))));

 - I've restricted pg_file_read to 50k max. What's a reasonable limit for 
 a generic function?

Uh, that seems fine.  You already check to see it is within the limit. 
I think a bigger question is should we limit it at all?  Do we limit
pg_largeobject?  Is that similar?

 - pg_file_read and pg_file_write read/write text only; should have 
 binary versions too.

I guess we could but no one is asking for that yet so I would leave it
for later.

 Very open question:
 - How should a backend know the logger's pid if it's not in shmem. Write 
 a magic string to the pipe?

I think it has to and in fact the pid is being written by the
postmaster, not by the logger process, so that should be OK.  The issue
is that the logger shouldn't _attach_ to shared memory unless it has to.

As far as recording the current log timestamp, I think that will be a
problem.  I would much rather see us forget about doing timestamp
processing with these log files and keep it simple at this point and see
what needs we have for 7.6.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian
Simon Riggs wrote:
 On Fri, 2004-07-16 at 16:58, Zeugswetter Andreas SB SD wrote:
Do we need a checkpoint after the archiving
starts but before the backup begins?
   
No.
   
   Actually yes.
  
  Sorry, I did incorrectly not connect 'archiving' with the backed up xlogs :-(
  So yes, you need one checkpoint after archiving starts. Imho turning on xlog
  archiving should issue such a checkpoint just to be sure. 
  
 
 By agreement, archive_mode can only be turned on at postmaster startup,
 which means you always have a checkpoint - either because you shut it
 down cleanly, or you didn't and it recovers, then writes one.
 
 There is always something to start the rollforward. 
 
 So, non-issue.

I don't think so.  I can imagine many cases where you want to do a
nightly tar backup without turning archiving on/off or restarting the
postmaster.  In those cases, a manual checkpoint would have to be issued
before the backup begins.

Imagine a system that is up for a month, and they don't have enough
archive space to keep a month's worth of WAL files.  They would probably
do nightly or weekend tar backups, and then discard the WAL archives.

What procedure would they use?  I assume they would copy all their old
WAL files to a save directory, issue a checkpoint, do a tar backup, then
they can delete the saved WAL files.  Is that correct?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Memory management with C aggregate

2004-07-16 Thread Ian Burrell
Tom Lane wrote:
Ian Burrell [EMAIL PROTECTED] writes:
We have user-defined aggregates written in C running inside the server. 
 We are running into some memory management issues and wondering what 
is the best way to solve the problem.

The state of the aggregates is a structure with a pointer to allocated 
memory.  The structure and memory are allocated in the 
PortalMemoryContext and freed in the ffunc.  We just discovered that the 
ffunc function can be called multiple times with the same state on 
certain queries.  The double free causes a crash.

AFAICS the ffunc should be called only once per aggregated group.
Perhaps your code is confused about the handling of groups?  If so,
the double free is hardly your only worry --- you'll be computing wrong
answers anyway till you fix that.
The aggregate is in a subquery which might make a difference.  The ffunc 
is only called multiple times when a nested loop is used in the query 
plan.  With other query plans, the ffunc is only called once.  The ffunc 
seems to be called once for every combination.

I can't get the following query to use a nested loop, but our query is 
similar, but with a compound key and a custom aggregate.  If x has N 
rows, y has M, with N groups, then the ffunc can be called N*M times, M 
times for each of the N groups.

SELECT x.key_no, s.agg
FROM x
INNER JOIN (
SELECT y.key_no, custom_agg(num) AS agg
FROM y
GROUP BY key_no
) s
USING (key_no)
ORDER BY key_no
I'll try to come up with a test case that illustrates the problem.
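A minimal sketch of a setup along those lines (table, column, and key names follow the
query above; the data and planner settings are invented just to coax out a nested-loop
plan, and sum() stands in for the C-language custom_agg):

CREATE TABLE x (key_no integer);
CREATE TABLE y (key_no integer, num integer);
INSERT INTO x VALUES (1);
INSERT INTO x VALUES (2);
INSERT INTO y VALUES (1, 10);
INSERT INTO y VALUES (1, 20);
INSERT INTO y VALUES (2, 30);

-- push the planner toward a nested loop over the aggregated subquery
SET enable_hashjoin = off;
SET enable_mergejoin = off;

SELECT x.key_no, s.agg
FROM x
INNER JOIN (
    SELECT y.key_no, sum(num) AS agg
    FROM y
    GROUP BY key_no
) s USING (key_no)
ORDER BY key_no;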
 - Ian

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] case sensitivity in PQExecPrepared

2004-07-16 Thread Merlin Moncure
I noticed a curious thing (not sure if by design or not).

While using the PQExecPrepared interface, the statement name passed to
the function (as a const char*) has to be in lower case to get it to
work.  I kind of understand why this is, but it is kind of weird that
passing the exact same statement name to PQExecPrepared and PREPARE does
not work if it contains any upper case characters.

Just FYI.
Merlin


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Bruce Momjian wrote:
- How's the official way to restrict pg_* functions to superuser only

Very crudely  :-)
Got it.
'nother question: Is reading the logfile a task that may be allowed to 
superusers only? I don't think so, though acls might apply.


Uh, that seems fine.  You already check to see it is within the limit. 
I think a bigger question is should we limit it at all?  Do we limit
pg_largeobject?  Is that similar?
Ok, no limit (but a default maximum of 50k remains). And since it's 
superuser only, he hopefully knows what he does.


Very open question:
- How should a backend know the logger's pid if it's not in shmem. Write 
a magic string to the pipe?

I think it has to and in fact the pid is being written by the
postmaster, not by the logger process, so that should be OK.  The issue
is that the logger shouldn't _attach_ to shared memory unless it has to.
It doesn't. It inherits the unnamed shared mem segment from the 
postmaster, as all subprocesses.

As far as recording the current log timestamp, I think that will be a
problem.  I would much rather see us forget about doing timestamp
processing with these log files and keep it simple at this point and see
what needs we have for 7.6.
I'm a bit insisting on this point. Remember, this all started from the 
attempt to display the serverlog on the client side. To do this, I need 
a way to retrieve the current logfile properties (size, and in case of 
rotation timestamp too) in a low-overhead way, or at least get to know 
something has changed. Scanning a whole directory and interpreting the 
data isn't low overhead any more.

There's no locking on the shmem, and the single dependence on shmem is 
the existence of it at the time of rotation. If the shmem is gone, 
postmaster is probably dead anyway.

Regards,
Andreas
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] case sensitivity in PQExecPrepared

2004-07-16 Thread Peter Eisentraut
Merlin Moncure wrote:
 While using the PQExecPrepared interface, the statement name passed
 to the function (as a const char*) has to be in lower case to get it
 to work.  I kind of understand why this is, but it is kind of weird
 that passing the exact same statement name to PQExecPrepared and
 PREPARE does not work if it contains any upper case characters.

PQExecPrepared is used in C programs, PREPARE is used in SQL programs.  
I think it's fairly obvious that those use different syntax rules.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] case sensitivity in PQExecPrepared

2004-07-16 Thread Merlin Moncure
Peter Eisentraut wrote:
 Merlin Moncure wrote:
  While using the PQExecPrepared interface, the statement name passed
  to the function (as a const char*) has to be in lower case to get it
  to work.  I kind of understand why this is, but it is kind of weird
  that passing the exact same statement name to PQExecPrepared and
  PREPARE does not work if it contains any upper case characters.
 
 PQExecPrepared is used in C programs, PREPARE is used in SQL programs.
 I think it's fairly obvious that those use different syntax rules.

Well, yes :)  Just to be absolutely clear what I mean, the following
will fail (pseudocode, but you get the idea):

char stmt[] = "prepare X as select 0()";
PQexec(c, "execute X"); -- works
PQexecPrepared(c, "X" [...]); -- fails

You are saying this is the correct and expected behavior?

Merlin


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] case sensitivity in PQExecPrepared

2004-07-16 Thread Peter Eisentraut
Merlin Moncure wrote:
 Well, yes :)  Just to be absolutely clear what I mean, the following
 will fail (pseudocode, but you get the idea):

 char stmt[] = "prepare X as select 0()";
 PQexec(c, "execute X"); -- works
 PQexecPrepared(c, "X" [...]); -- fails

 You are saying this is the correct and expected behavior?

Yes, because part of those syntax rules is that in SQL, unquoted 
identifiers are folded to lower case, but in C they are not.
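
For instance, on the SQL side (statement names purely illustrative):

PREPARE X AS SELECT 0;     -- unquoted: the stored statement name is x
PREPARE "X" AS SELECT 0;   -- quoted: the stored statement name is exactly X

libpq passes the name you give to PQexecPrepared through verbatim, so passing X only
finds the second one; the statement prepared with the unquoted name has to be executed
as x.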

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Bruce Momjian
Andreas Pflug wrote:
 Bruce Momjian wrote:
 - How's the official way to restrict pg_* functions to superuser only
  
  
  Very crudely  :-)
 
 Got it.
 
 'nother question: Is reading the logfile a task that may be allowed to 
 superusers only? I don't think so, though acls might apply.

Yes, the log file might contain SQL queries issued by others.  It is a
super-user only capability.

  Uh, that seems fine.  You already check to see it is within the limit. 
  I think a bigger question is should we limit it at all?  Do we limit
  pg_largeobject?  Is that similar?
 
 Ok, no limit (but a default maximum of 50k remains). And since it's 
 superuser only, he hopefully knows what he does.

Huh?  Why have a default maximum?

 Very open question:
 - How should a backend know the logger's pid if it's not in shmem. Write 
 a magic string to the pipe?
  
  
  I think it has to and in fact the pid is being written by the
  postmaster, not by the logger process, so that should be OK.  The issue
  is that the logger shouldn't _attach_ to shared memory unless it has to.
 
 It doesn't. It inherits the unnamed shared mem segment from the 
 postmaster, as all subprocesses.

Ah, I think it needs to close that as soon as it starts.  Don't other
subprocesses do that?  That shared memory is very fragile and we don't
want an errant pointer poking in there.

  As far as recording the current log timestamp, I think that will be a
  problem.  I would much rather see us forget about doing timestamp
  processing with these log files and keep it simple at this point and see
  what needs we have for 7.6.
 
 I'm a bit insisting on this point. Remember, this all started from the 
 attempt to display the serverlog on the client side. To do this, I need 
 a way to retrieve the current logfile properties (size, and in case of 
 rotation timestamp too) in a low-overhead way, or at least get to know 
 something has changed. Scanning a whole directory and interpreting the 
 data isn't low overhead any more.

This seems clean and fast enough to me:

SELECT filename
FROM pg_dir_ls('/var/log')
ORDER BY 1 DESC
LIMIT 1

Considering that any query from a client is going to have to go through
the parser and be executed, an 'ls' in a directory just isn't a
measurable performance hit.

If you want, run a test that does an 'ls' and one that doesn't to see
that there is no measurable performance difference.

I would not worry about the clock going backward.  PostgreSQL would have
enough problems with timestamp columns moving backward that the file log
times are the least of our problems.

 There's no locking on the shmem, and the single dependence on shmem is 
 the existence of it at the time of rotation. If the shmem is gone, 
 postmaster is probably dead anyway.

You can't know that you aren't reading corrupt data if you read shared
memory without a lock.  What if the write is happening as you read?

The only clean solution I can think of is to write an operating system
file that contains the current log filename and read from that.  I
believe such writes are atomic.  But again, this seems like overkill to
me.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Nested Savepoints in Db2 Stinger

2004-07-16 Thread Alvaro Herrera
On Thu, Jul 15, 2004 at 08:41:08PM +0100, Simon Riggs wrote:
 FYI
 
 http://www.databasejournal.com/features/db2/article.php/3361941

Hmm.  Basically it says that they implement SQL2003 semantics (or the
other way around: SQL2003 was written with DB2 in mind).

In the code there is a provision for savepoint levels, but it doesn't do
anything (yet).

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
There are two moments in a man's life when he should not
speculate: when he can afford it and when he cannot (Mark Twain)


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 19:30, Bruce Momjian wrote:
 Simon Riggs wrote:
  On Fri, 2004-07-16 at 16:58, Zeugswetter Andreas SB SD wrote:
 Do we need a checkpoint after the archiving
 starts but before the backup begins?

 No.

Actually yes.
   
   Sorry, I did incorrectly not connect 'archiving' with the backed up xlogs :-(
   So yes, you need one checkpoint after archiving starts. Imho turning on xlog
   archiving should issue such a checkpoint just to be sure. 
   
  
  By agreement, archive_mode can only be turned on at postmaster startup,
  which means you always have a checkpoint - either because you shut it
  down cleanly, or you didn't and it recovers, then writes one.
  
  There is always something to start the rollforward. 
  
  So, non-issue.
 

I was discussing the claim that there might not be a checkpoint to begin
the rollforward from. There always is: if you are in archive_mode=true
then you will always have a checkpoint that can be used for recovery. It
may be a long way in the past, if there has been no write activity,
but the rollforward will be very quick, since there will be no log
records.

 I don't think so.  I can imagine many cases where you want to do a
 nightly tar backup without turning archiving on/off or restarting the
 postmaster.  

This is a misunderstanding. I strongly agree with what you say: the
whole system has been designed to avoid any benefit from turning on/off
archiving and there is no requirement to restart postmaster to take
backups.

 In those cases, a manual checkpoint would have to be issued
 before the backup begins.

A manual checkpoint doesn't HAVE TO be issued. Presumably most systems
will be running checkpoint every few minutes. Wherever the last one was
is where the rollforward would start from.

But you can, if that's the way you want to do things; just wait long
enough for the checkpoint to have completed, otherwise your objective of
reducing rollforward time will not be met.

(please note my earlier reported rollback performance of approximately
x10 rate of recovery v elapsed time - will require testing on your own
systems).

 Imagine a system that is up for a month, and they don't have enough
 archive space to keep a months worth of WAL files.  They would probably
 do nightly or weekend tar backups, and then discard the WAL archives.
 

Yes, that would be normal practice. I would recommend keeping at least
the last 3 full backups and all of the WAL logs to cover that period.

 What procedure would they use?  I assume they would copy all their old
 WAL files to a save directory, issue a checkpoint, do a tar backup, then
 they can delete the saved WAL files.  Is that correct?

PITR is designed to interface with a wide range of systems, through the
extensible archive/recovery program interface. We shouldn't focus on
just tar backups - if you do, then the whole thing seems less
feature-rich. The current design allows interfacing with tape, remote
backup, internet backup providers, automated standby servers and the
dozen major storage/archive vendors' solutions.

Writing a procedure to backup, assign filenames, keep track of stuff
isn't too difficult if you're a competent DBA with a mild knowledge of
shell or perl scripting. But if data is important, people will want to
invest the time and trouble to adopt one of the open source or licenced
vendors that provide solutions in this area.
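
As a sketch of what such a shell procedure might look like - the
two-argument convention (path of the completed segment, then its file name)
and the archive directory are assumptions for illustration, not the
committed interface:

    #!/bin/sh
    # Hypothetical archive script, called once per completed WAL segment.
    WALPATH="$1"              # e.g. .../pg_xlog/<segment>
    WALFILE="$2"              # segment file name only
    ARCHIVE=/mnt/archive/pg_wal

    # Never overwrite an archived segment; copy via a temp name so a
    # partially written file is never mistaken for a complete one.
    test ! -f "$ARCHIVE/$WALFILE" || exit 1
    cp "$WALPATH" "$ARCHIVE/$WALFILE.tmp" &&
    mv "$ARCHIVE/$WALFILE.tmp" "$ARCHIVE/$WALFILE"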

Systems management is a discipline and procedures should be in place for
everything. I fully agree with the automate everything dictum, but
just don't want to constrain people too much to a particular way of
doing things.

-o-o-

Overall, for a first release, I think the complexity of this design is
acceptable. PITR is similar to Oracle7 Backup/Recovery, and easily
recognisable to any DBA with experience of current SQL Server,
DB2 (MVS, UDB) or Teradata systems. [I can't comment much on Ingres,
Informix, Sybase etc]

My main areas of concern are:
- the formal correctness of the recovery process
As a result of this concern, PITR makes ZERO alterations to the recovery
code itself. The trick is to feed it the right xlog files and to stop,
if required, at the right place and allow normal work to resume.

- the robustness and quality of my implementation
This requires quality checking of the code and full beta testing

-o-o-

We've raised a couple of valid points on the lists in the last few days:
- it's probably a desirable feature (but not essential) to implement a
write-suspend feature on the bgwriter; if nothing else it will be a
confidence-building feature... as said previously, for many people this
will not be required, but people will no doubt keep asking
- there is a small window of risk around the possibility that a recovery
target might be set by the user that doesn't roll forward all the way
past the end of the backup. That is real, but in general, people aren't
likely to be performing archive recovery within minutes of a backup
being taken - 

Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Bruce Momjian wrote:
Ok, no limit (but a default maximum of 50k remains). And since it's 
superuser only, he hopefully knows what he does.

Huh?  Why have a default maximum?
Just for convenience. Both start and size are optional parameters, but 
with start=0 and size=50000. Well, it's a very special function anyway, 
so we could require the user to supply all parameters. I'll remove it.


Ah, I think it needs to close that as soon as it starts.  Don't other
subprocesses do that?  That shared memory is very fragile and we don't
want an errant pointer poking in there.
The result of an errant pointer writing to that shared mem would be
1) wrong pid for SysLogger, so it can't be signalled to rotate from backends
2) wrong timestamp, so backends don't know the latest logfile.
Nothing particularly crash prone really.

This seems clean and fast enough to me:
SELECT filename
FROM pg_dir_ls('/var/log')
ORDER BY 1 DESC
LIMIT 1
For a logfile listing function, this would look like
SELECT MAX(startdate)
  FROM pg_logfile_ls()

Considering that any query from a client is going to have to go through
the parser and be executed, an 'ls' in a directory just isn't a
measurable performance hit.
If you want, run a test that does an 'ls' and one that doesn't to see
that there is no measurable performance difference.
So while a simple PG_RETURN_TIMESTAMP(logfiletimestamp) is drastically 
faster than a lengthy set-returning function, the difference appears 
much less drastic once parser overhead is included.

I would not worry about the clock going backward.  PostgreSQL would have
enough problems with timestamp columns moving backward that the file log
times are the least of our problems.
I see, so the admin is in trouble anyway (what about PITR? Data column 
deviations appear harmless compared to restoration based on timestamps).


You can't know that you aren't reading corrupt data if you read shared
memory without a lock.  What if the write is happening as you read?
I thought about this for quite a while.
If the shmem fields aren't written atomically (one is 32-bit, one 64-bit, 
probably on dword boundaries, so writes will happen at least processor-bus 
wide; do we support any 16-bit processor?) the corruption 
consequences described above apply. In the case of the timestamp, the high 
word will rarely change anyway, only every 2^32 seconds...

Concurrent access on the logger pid would mean calling 
pg_logfile_rotate() while a killed logger is being restarted, which 
creates a new logfile anyway. This would send a SIGINT into outer 
space, maybe to the bgwriter triggering a checkpoint, or the postmaster 
shutting it down (gracefully, but still unwanted).

BTW, the consequences of a trigger flag in shmem would be smaller, because 
all that could happen is a log rotation (which appends to existing 
files, just in case syslogger died in the milliseconds after a rotation).

The only clean solution I can think of is to write an operating system
file that contains the current log filename and read from that.  I
believe such writes are atomic.  But again, this seems like overkill to
me.
Ah wait.
Digging further behind SIGUSR1 I now *do* see a solution without pid in 
shmem, using SendPostmasterSignal. Well, a little hint from gurus would 
have helped...

I'll convert to this, *dropping* all shmem.
Regards,
Andreas
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Andreas Pflug wrote:
Ah wait.
Digging further behind SIGUSR1 I now *do* see a solution without pid in 
shmem, using SendPostmasterSignal. Well, a little hint from gurus would 
have helped...

Oops, SendPostmasterSignal uses shmem
At least, this enables syslogger.c to be free from shmem stuff, except 
for PGSharedMemDetach.

Regards,
Andreas
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Bruce Momjian
Andreas Pflug wrote:
 Bruce Momjian wrote:
 Ok, no limit (but a default maximum of 50k remains). And since it's 
 superuser only, he hopefully knows what he does.
  
  
  Huh?  Why have a default maximum?
 
 Just for convenience. Both start and size are optional parameters, but 
 with start=0 and size=50000. Well, it's a very special function anyway, 
 so we could require the user to supply all parameters. I'll remove it.

Agreed, and maybe a zero value gets the entire file.

  Ah, I think it needs to close that as soon as it starts.  Don't other
  subprocesses do that?  That shared memory is very fragile and we don't
  want an errant pointer poking in there.
 
 The result of an errant pointer writing to that shared mem would be
 
 1) wrong pid for SysLogger, so it can't be signalled to rotate from backends
 2) wrong timestamp, so backends don't know the latest logfile.
 
 Nothing particularly crash prone really.

No, I am thinking the program goes crazy and writes everywhere.

  This seems clean and fast enough to me:
  
  SELECT filename
  FROM pg_dir_ls('/var/log')
  ORDER BY 1 DESC
  LIMIT 1
 
 For a logfile listing function, this would look like
 
 SELECT MAX(startdate)
FROM pg_logfile_ls()
 
 
  
  Considering that any query from a client is going to have to go through
  the parser and be executed, an 'ls' in a directory just isn't a
  measurable performance hit.
  
  If you want, run a test that does an 'ls' and one that doesn't to see
  that there is no measurable performance difference.
  
 So while a simple PG_RETURN_TIMESTAMP(logfiletimestamp) is drastically 
 faster than a lengthy set-returning function, the difference appears 
 much less drastic once parser overhead is included.
 
  I would not worry about the clock going backward.  PostgreSQL would have
  enough problems with timestamp columns moving backward that the file log
  times are the least of our problems.
 
 I see, so the admin is in trouble anyway (what about PITR? Data column 
 deviations appear harmless compared to restoration based on timestamps).

PITR uses WAL numbering so it would be fine, but the timestamps on the
commit records would have problems.

  You can't know that you aren't reading corrupt data if you read shared
  memory without a lock.  What if the write is happening as you read?
 
 I thought about this for quite a while.
 
 If the shmem fields aren't written atomically (one is 32-bit, one 64-bit, 
 probably on dword boundaries, so writes will happen at least processor-bus 
 wide; do we support any 16-bit processor?) the corruption 
 consequences described above apply. In the case of the timestamp, the high 
 word will rarely change anyway, only every 2^32 seconds...
 
 Concurrent access on the logger pid would mean calling 
 pg_logfile_rotate() while a killed logger is being restarted, which 
 creates a new logfile anyway. This would send a SIGINT into outer 
 space, maybe to the bgwriter triggering a checkpoint, or the postmaster 
 shutting it down (gracefully, but still unwanted).
 
 BTW, the consequences of a trigger flag in shmem would be smaller, because 
 all that could happen is a log rotation (which appends to existing 
 files, just in case syslogger died in the milliseconds after a rotation).
 
  
  The only clean solution I can think of is to write an operating system
  file that contains the current log filename and read from that.  I
  believe such writes are atomic.  But again, this seems like overkill to
  me.
 
 Ah wait.
 Digging further behind SIGUSR1 I now *do* see a solution without pid in 
 shmem, using SendPostmasterSignal. Well, a little hint from gurus would 
 have helped...
 
 I'll convert to this, *dropping* all shmem.

Yes, that is the usual method.  We signal the postmaster and it then
does the signalling to the logger.  I thought you had looked at other
backend signalling examples so I didn't explain it.

Now, one really good efficiency would be to use LISTEN/NOTIFY so clients
could know when new data has appeared in the log, or the log file is
rotated.  Now that's an efficiency!   However, let's get this
infrastructure completed first.   One wacky idea would be for the
clients to LISTEN on 'pg_new_logfile' and have the logger do
system('psql -c NOTIFY pg_new_logfile template1') or something like
that.
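
If anyone tries that, the quoting would need to look something like the
sketch below; note that the listening side only works on a connection that
stays open, so a one-shot psql cannot be the listener:

    # Logger side, fired after a rotation (hypothetical, per the idea above):
    psql -c "NOTIFY pg_new_logfile" template1

    # Client side: must be issued on the client's own long-lived connection
    # (e.g. from pgAdmin or an application session), not a throwaway psql:
    #   LISTEN pg_new_logfile;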

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Bruce Momjian
Andreas Pflug wrote:
 Andreas Pflug wrote:
 
  
  Ah wait.
  Digging further behind SIGUSR1 I now *do* see a solution without pid in 
  shmem, using SendPostmasterSignal. Well, a little hint from gurus would 
  have helped...
  
 
 Oops, SendPostmasterSignal uses shmem
 
 At least, this enables syslogger.c to be free from shmem stuff, except 
 for PGSharedMemDetach.

Right.  We already have to use shared mem for the backends and
postmaster.  It is the logger we are worried about.

Tom brought up the point that if the logger used shared memory, we would
have to kill/restart it if we need to reinitialize shared memory,
meaning we would lose logging info at a time we really need it ---
again a good reason not to use shared memory in the logger.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 16:47, Tom Lane wrote:
 As far as the business about copying pg_control first goes: there is
 another way to think about it, which is to copy pg_control to another
 place that will be included in your backup.  For example the standard
 backup procedure could be
 
 1. [somewhat optional] Issue CHECKPOINT and wait till it finishes.
 
 2. cp $PGDATA/global/pg_control $PGDATA/pg_control.dump
 
 3. tar cf /dev/mt $PGDATA
 
 4. do something to record ending WAL position
 
 If we standardized on this way, then the tar archive would automatically
 contain the pre-backup checkpoint position in ./pg_control.dump, and
 there is no need for any special assumptions about the order in which
 tar processes things.
 

Sounds good. That would be familiar to Oracle DBAs doing BACKUP
CONTROLFILE. We can document that and offer it as a suggested procedure.

 However, once you decide to do things like that, there is no reason why
 the copied file has to be an exact image of pg_control.  I claim it
 would be more useful if the copied file were plain text so that you
 could just cat it to find out the starting WAL position; that would
 let you determine without any special tools what range of WAL archive
 files you are going to need to bring back from your archives.

I wouldn't be in favour of a manual mechanism. If you want an automated
mechanism, what's wrong with using the one that's already there? You can
use pg_controldata to read the controlfile; again, what's wrong with that?

We agreed some time back that an off-line xlog file inspector would be
required to allow us to inspect the logs and make a decision about where
to end recovery. You'd still need that.

It's scary enough having to specify the end point, let alone having to
specify the starting point as well.

At your request, and with Bruce's idea, I designed and built the
recovery system so that you don't need to know what range of xlogs to
bring back. You just run it, it brings back the right files from archive
and does recovery with them, then cleans up - and it works without
running out of disk space on long recoveries.

I've built it now and it works...
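
Purely as an illustration of the restore side of that interface (the
argument convention and archive path are assumptions, not the actual
design):

    #!/bin/sh
    # Hypothetical restore script: recovery asks for one WAL segment by
    # name and says where to put it; exit non-zero if we don't have it.
    WALFILE="$1"
    DEST="$2"
    ARCHIVE=/mnt/archive/pg_wal

    test -f "$ARCHIVE/$WALFILE" || exit 1
    cp "$ARCHIVE/$WALFILE" "$DEST"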

 This is pretty much the same chain of reasoning that Bruce and I went
 through yesterday to come up with the idea of putting a label file
 inside the tar backups.  We concluded that it'd be worth putting
 both the backup starting time and the checkpoint WAL position into
 the label file --- the starting time isn't needed for restore but
 might be really helpful as documentation, if you needed to verify
 which dump file was which.

...if you are doing tar backups...what will you do if you're not using
that mechanism?

If you are: it's common practice to make up a backup filename from
elements such as system name, database name, date and time, etc. That gives
you the start time; the file's last mod date gives you the end time. 
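
For example, something along these lines (names purely illustrative):

    # Encode system, cluster and start time in the name; the file's mtime
    # then records when the backup finished.
    BACKUP="/backup/$(hostname)_pgdata_$(date +%Y%m%d%H%M%S).tar"
    tar cf "$BACKUP" "$PGDATA"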

I think it's perfectly fine for everybody to do backups any way they
please. There are many licenced variants of PostgreSQL and it might be
appropriate in those to specify particular ways of doing things.

I'll be trusting the management of backup metadata and storage media to
a solution designed for the purpose (open or closed source), just as
I'll be trusting my data to a database solution designed for that
purpose. That for me is one of the good things about PostgreSQL - we use
the filesystem rather than writing our own, and we provide language
interfaces rather than inventing our own proprietary server language, etc.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Bruce Momjian
Andreas Pflug wrote:
 Bruce Momjian wrote:
  Andreas Pflug wrote:
 
  Right.  We already have to use shared mem for the backends and
  postmaster.  It is the logger we are worried about.
  
  Tom brought up the point that if the logger used shared memory, we would
  have to kill/restart it if we need to reinitialize shared memory,
 
 I don't know why that particular segment should ever be renewed. Anyway, 
 it's gone.

As I remember, we have one big shared memory segment.  Were you
creating a special one just for this timestamp?  If you were, I see why
your approach was safer, but as you said, it doesn't buy us much anyway.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Bruce Momjian wrote:
Andreas Pflug wrote:

Right.  We already have to use shared mem for the backends and
postmaster.  It is the logger we are worried about.
Tom brought up the point that if the logger used shared memory, we would
have to kill/restart it if we need to reinitialize shared memory,
I don't know why that particular segment should ever be renewed. Anyway, 
it's gone.

Regards,
Andreas
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] serverlog rotation/functions

2004-07-16 Thread Andreas Pflug
Bruce Momjian wrote:
Andreas Pflug wrote:
Just for convenience. Both start and size are optional parameters, but 
with start=0 and size=50000. Well, it's a very special function anyway, 
so we could require the user to supply all parameters. I'll remove it.

Agreed, and maybe a zero value gets the entire file.
Which brings a default param back again, maybe on a 100MB file? Better not. 
Let's leave it to the admin to do sick stuff like 
pg_read_file('base/5000/5002', 0, 1) ...
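
With all parameters required, a routine call would simply spell everything
out, e.g. (the log file name here is invented):

    # Read the first 50000 bytes of a server-side file:
    psql -c "SELECT pg_read_file('pg_log/current.log', 0, 50000)" template1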

No, I am thinking the program goes crazy and writes everywhere.
What I described was just that situation.

Yes, that is the usual method.  We signal the postmaster and it then
does the signalling to the logger.  I thought you had looked at other
backend signalling examples so I didn't explain it.
Well, if you know the places where backends signal stuff to the 
postmaster... Still, somebody could have yelled "use the standard way" 
before I went reinventing the wheel.

Now, one really good efficiency would be to use LISTEN/NOTIFY so clients
could know when new data has appeared in the log, or the log file is
rotated.  Now that's an efficiency!   However, let's get this
infrastructure completed first.   One wacky idea would be for the
clients to LISTEN on 'pg_new_logfile' and have the logger do
system('psql -c NOTIFY pg_new_logfile template1') or something like
that.
No, certainly not. This would mean that every time something is logged, psql 
is fired up. Tom wouldn't accept this as KISS, I believe. And h*ll, that 
would cause traffic (just imagine a single log message on client startup...)

What you saw on LinuxTag was pgAdmin3 polling once a second to see if the 
logfile length had changed, which is the fastest setting possible.

Regards,
Andreas


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian

OK, I think I have some solid ideas and reasons for them.

First, I think we need server-side functions to call when we start/stop
the backup.  The advantage of these server-side functions is that they
will do the required work of recording the pg_control values and
creating needed files with little chance for user error.  It also allows
us to change the internal operations in later releases without requiring
admins to change their procedures.  We are even able to adjust the
internal operation in minor releases without forcing a new procedure on
users.
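
In use, the whole procedure would then reduce to something like this sketch;
the function names are placeholders for whatever we end up calling them:

    #!/bin/sh
    # Hypothetical bracketing functions that record the pg_control values
    # and write/complete the label file on the server side.
    psql -c "SELECT pg_backup_start('nightly')" template1
    tar cf /backup/pgdata-$(date +%Y%m%d).tar "$PGDATA"
    psql -c "SELECT pg_backup_stop()" template1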

Second, I think once we start a restore, we should rename recovery.conf
to recovery.in_progress, and when complete rename that to
recovery.done.  If the postmaster starts and sees recovery.in_progress,
it will fail to start knowing its recovery was interrupted.  This allows
the admin to take appropriate action.  (I am not sure what that action
would be. Does he bring back the backup files or just keep going?)

Third, I think we need to put a file in the archive location once we
complete a backup, recording the start/stop xid and wal/offsets.  This
gives the admin documentation on what archive logs to keep and what xids
are available for recovery.  Ideally the recover program would read that
file and check the recover xid to make sure it is after the stop xid
recorded in the file.

How would the recover program know the name of that file?  We need to
create it in /data with start contents before the backup, then complete
it with end contents and archive it.

What should we name it?  Ideally it would be named by the WAL
name/offset of the start so it orders in the proper spot in the archive
file listing, e.g.:

093a
093b
093b.032b9.start
093c

Are people going to know they need 093b for
093b.032b9.start?  I hope so.  Another idea is to do:


093a.xlog
093b.032b9.start
093b.xlog
093c.xlog

This would order properly.  It might be a very good idea to add
extensions to these log files now that we are archiving them in strange
places.  In fact, maybe we should use *.pg_xlog to document the
directory they came from.
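
To make the contents of that start/stop file concrete, it need not be more
than a few self-describing lines; all values below are invented:

    $ cat 093b.032b9.start
    START WAL:       093b, offset 032b9
    START XID:       10234
    STOP WAL:        093c, offset 01a70
    STOP XID:        10499
    BACKUP STARTED:  2004-07-16 02:00:03
    BACKUP FINISHED: 2004-07-16 02:41:55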

---


Simon Riggs wrote:
 On Fri, 2004-07-16 at 16:47, Tom Lane wrote:
  As far as the business about copying pg_control first goes: there is
  another way to think about it, which is to copy pg_control to another
  place that will be included in your backup.  For example the standard
  backup procedure could be
  
  1. [somewhat optional] Issue CHECKPOINT and wait till it finishes.
  
  2. cp $PGDATA/global/pg_control $PGDATA/pg_control.dump
  
  3. tar cf /dev/mt $PGDATA
  
  4. do something to record ending WAL position
  
  If we standardized on this way, then the tar archive would automatically
  contain the pre-backup checkpoint position in ./pg_control.dump, and
  there is no need for any special assumptions about the order in which
  tar processes things.
  
 
 Sounds good. That would be familiar to Oracle DBAs doing BACKUP
 CONTROLFILE. We can document that and offer it as a suggested procedure.
 
  However, once you decide to do things like that, there is no reason why
  the copied file has to be an exact image of pg_control.  I claim it
  would be more useful if the copied file were plain text so that you
  could just cat it to find out the starting WAL position; that would
  let you determine without any special tools what range of WAL archive
  files you are going to need to bring back from your archives.
 
 I wouldn't be in favour of a manual mechanism. If you want an automated
 mechanism, what's wrong with using the one that's already there? You can
 use pg_controldata to read the controlfile; again, what's wrong with that?
 
 We agreed some time back that an off-line xlog file inspector would be
 required to allow us to inspect the logs and make a decision about where
 to end recovery. You'd still need that.
 
 It's scary enough having to specify the end point, let alone having to
 specify the starting point as well.
 
 At your request, and with Bruce's idea, I designed and built the
 recovery system so that you don't need to know what range of xlogs to
 bring back. You just run it, it brings back the right files from archive
 and does recovery with them, then cleans up - and it works without
 running out of disk space on long recoveries.
 
 I've built it now and it works...
 
  This is pretty much the same chain of reasoning that Bruce and I went
  through yesterday to come up with the idea of putting a label file
  inside the tar backups.  We concluded that it'd be worth putting
  both the backup starting time and the checkpoint WAL position into
  the label file --- the starting time isn't needed for restore but
  might be really helpful as documentation, if you needed to verify
  which 

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian

Let me address your concerns about PITR getting into 7.5. I think a few
people spoke last week expressing concern about our release process and
wanting to take drastic action.  However, looking at the release status
report I am about to post, you will see we are on track for an August 1
beta.

PITR has been neglected only because it has been moving along so well we
haven't needed to get deeply involved.  Simon has been able to address
concerns as we raised them and make adjustments quickly with little
guidance.  

Now, we certainly don't want to skip adding PITR by not giving it our
full attention to get into 7.5.  Once Tom completes the cursor issues
with NT in the next day or so,  I think that removes the last big NT
stumbling block, and we will start to focus on PITR.  Unless there is
some major thing we are missing, we fully expect to get PITR in 7.5.  We
don't have a crystal ball to know for sure, but our intent is clear.

I know Simon is going away July 26 so we want to get him feedback as
soon as possible.  If we wait until after July 26, we will have to make
all the adjustments without Simon's guidance, which will be difficult.

As far as the importance of PITR, it is a _key_ enterprise feature, even
more key than NT.  PITR is going to be one of the crowning jewels of the
7.5 release, and I don't want to go into beta without it unless we can't
help it.

So, I know the deadline is looming and everyone is getting nervous,
but keep the faith.  I can see the light at the end of the tunnel.  I
know this is a tighter schedule than we would like, but I know we can do
it, and I expect we will do it.

---

Simon Riggs wrote:
 On Thu, 2004-07-15 at 15:57, Bruce Momjian wrote:
 
  We will get there --- it just seems dark at this time.
 
 Thanks for that. My comments were heartfelt, but not useful right now. 
 
 I'm badly overdrawn already on my time budget, though that is my concern
 alone. There is more to do than I have time for. Pragmatically, if we
 aren't going to get there then I need to stop now, so I can progress
 other outstanding issues. All help is appreciated.
 
 I'm aiming for the minimum feature set - which means we do need to take
 care over whether that set is insufficient and also to pull any part
 that doesn't stand up to close scrutiny over the next few days.
 
 Overall, my primary goal is increased robustness and availability for
 PostgreSQL...and then to have a rest!
 
 Best Regards, Simon Riggs
 
 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[HACKERS] Release status report

2004-07-16 Thread Bruce Momjian
We are two weeks away from the beta freeze on August 1 so I want to give
a status report on where we are.

Right now we have two large outstanding items, and a few smaller ones. 
Nested transactions (NT) is in CVS, but it is missing:

o  savepoints
o  cursor rollback
o  performance fixes

Alvaro has completed the first item and a few more and will probably
submit a mega-patch shortly that can be applied.  Tom is working on
cursors and should finish in a day or two.  Performance fixes are needed
and will have to be addressed, though we might be adjusting these during
beta as we find them.

PITR has been ready for a while though Simon continues to improve it and
users are continuing to suggest fixes/ideas.  I am hopeful we can get
the majority of PITR issues/comments addressed next week while Simon is
available.

The other patches are in-progress and will be applied once the community
has finished requesting changes to each.

So, basically, I think we are on track.  It would be nice to get the
majority of NT/PITR work into CVS by next weekend so we can have a week
of polishing before beta.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html