Re: [GENERAL] Problem with starting PostgreSQL server 7.4.19

2008-03-12 Thread Tom Lane
Craig Ringer [EMAIL PROTECTED] writes:
> Kakoli Sen wrote:
>> It was running fine initially and the database was lying idle for a
>> few days. Today I logged into the machine and restarted the server by
>> killing the process by 'kill -9 pid'. And then restarted it by
>> 'postmaster -i -D /opt/pgsql/data/'.
>
> Why did you use `kill -9' ?

Certainly not good practice, but theoretically PG should be proof
against even such deliberate abuse as that.

What seemed odd to me was

 LOG:  database system was interrupted at 2008-03-06 14:15:17 IST
 LOG:  record with incorrect prev-link 1/0 at 0/A4EB08
 LOG:  invalid primary checkpoint record
 LOG:  record with incorrect prev-link 42FD/0 at 0/A4EAC8
 LOG:  invalid secondary checkpoint record

Experimentation shows that a freshly initialized 7.4 database has
WAL locations like this:

Latest checkpoint location:   0/9DFCF0
Prior checkpoint location:    0/9D92C0

so either you'd only ever thrown a few kilobytes of stuff into the DB
or there was something seriously wrong with pg_control to begin with.
I'm wondering about mistaken filesystem restores ...
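
To put a number on that comparison, here is a minimal sketch that subtracts
the offset part of two WAL locations. It assumes both locations share the
same log id (the part before the slash, which is 0 in every value quoted in
this thread) and that the part after the slash is a plain hexadecimal byte
offset. wal_offset_gap is a hypothetical helper name, not a PostgreSQL tool.

```shell
#!/bin/sh
# wal_offset_gap is a hypothetical helper, not part of PostgreSQL.
# Assumption: both locations have the same log id (the text before "/"),
# so only the hexadecimal offsets after "/" need to be compared.
wal_offset_gap() {
    from=${1#*/}   # strip the "logid/" prefix, e.g. 0/9DFCF0 -> 9DFCF0
    to=${2#*/}
    echo $(( 0x$to - 0x$from ))
}

# Latest checkpoint of a freshly initialized 7.4 cluster versus the
# primary checkpoint location reported in the failing server's log.
wal_offset_gap 0/9DFCF0 0/A4EB08
```

With the values quoted above this prints 454168, so the failing cluster's
checkpoint sits less than half a megabyte of WAL past a fresh initdb.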

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem with starting PostgreSQL server 7.4.19

2008-03-12 Thread Kakoli Sen
Hi,
Actually, I tried stopping the server with 'kill `cat
/opt/pgsql/data/postmaster.pid`'. This did not work, so I used kill -9
(on Red Hat 4).

This is a test database that we are still in the process of setting up, so
it does not have live data. Still, I do agree it was not a good idea.

Now, do I have to re-install PostgreSQL or is there any way out?

Server configuration is default. The only change from the default is allowing
TCP/IP connections.

Regards,

Kakoli







Re: [GENERAL] Problem with starting PostgreSQL server 7.4.19

2008-03-12 Thread Craig Ringer

Kakoli Sen wrote:
> Hi,
> Actually, I tried stopping the server with 'kill `cat
> /opt/pgsql/data/postmaster.pid`'. This did not work, so I used kill -9
> (on Red Hat 4).
>
> This is a test database that we are still in the process of setting up, so
> it does not have live data. Still, I do agree it was not a good idea.
>
> Now, do I have to re-install PostgreSQL or is there any way out?


Well, if it's a test database you should be able to rename or remove the 
data directory and then re-run initdb, since you don't care about the 
data in the database.


I'm sure you want to find out why this happened, though, so maybe you 
should keep the damaged database around for a while and see if anybody 
here has ideas about what could've happened.
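
Both suggestions can be combined by renaming the directory instead of
deleting it. The following is only a sketch: set_aside_datadir is a
hypothetical helper name, the path is the one mentioned in this thread, and
the server is assumed to be stopped before it runs.

```shell
#!/bin/sh
# set_aside_datadir is a hypothetical helper name.
# It renames the data directory (instead of deleting it) so the damaged
# cluster stays available for later inspection, and prints the new name.
set_aside_datadir() {
    datadir=${1%/}                      # drop one trailing slash, if any
    backup="$datadir.damaged.$(date +%Y%m%d%H%M%S)"
    [ -d "$datadir" ] || { echo "no such directory: $datadir" >&2; return 1; }
    mv "$datadir" "$backup" || return 1
    echo "$backup"
}

# Usage (server must be stopped first; path taken from this thread):
#   set_aside_datadir /opt/pgsql/data/ && initdb -D /opt/pgsql/data
```

Renaming keeps the old files in place on the same filesystem, so nothing is
lost if someone later wants to look at pg_control or the WAL files.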


--
Craig Ringer




Re: [GENERAL] Problem with starting PostgreSQL server 7.4.19

2008-03-12 Thread Tom Lane
Kakoli Sen [EMAIL PROTECTED] writes:
> Actually, I tried stopping server by 'kill `cat
> /opt/pgsql/data/postmaster.pid`'. This did not work. So I used kill -9 on Red
> Hat 4.

Define "did not work" ... what happened exactly?

I do not know of any prepackaged Postgres distribution for Red Hat that
would put the data directory under /opt.  Did you build from source?
If you used a prepackaged build then I'm thinking that you did not find
every place that needed to be changed to move the data directory.

I'm also kind of wondering why you are using either PG 7.4 or RH 4
for a new experimental setup.  Both of those versions can see their
EOL dates coming round the corner.

regards, tom lane



Re: [GENERAL] Problem with starting PostgreSQL server 7.4.19

2008-03-11 Thread Craig Ringer

Kakoli Sen wrote:
> Hello all,
> It was running fine initially and the database was lying idle for a
> few days. Today I logged into the machine and restarted the server by
> killing the process by 'kill -9 pid'. And then restarted it by
> 'postmaster -i -D /opt/pgsql/data/'.

Why did you use `kill -9' ? Was it not responding to `kill -15' ( ie 
SIGTERM, kill -TERM ) or shutdown using the init script?


SIGKILL, ie signal 9, terminates the process without giving it a chance 
to clean its state up. It gets no chance to write out buffered data, 
mark data files as clean, or take any other safe shutdown actions. It's 
a REALLY REALLY BAD IDEA to do this on a database server, though it 
should still be able to recover if it's configured to operate with fsync 
enabled etc.
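
For comparison, a minimal sketch of a polite shutdown. The postmaster treats
SIGTERM as a request for a smart shutdown and SIGINT as a fast shutdown, and
pg_ctl stop -m smart or -m fast sends the same requests. The helper names
below are hypothetical. The sketch also assumes that the postmaster PID is
the first line of postmaster.pid and that further lines may follow it; where
that holds, running `cat` on the whole file would hand kill extra arguments.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of PostgreSQL.

# Print the postmaster PID. Assumption: it is the FIRST line of
# postmaster.pid and other lines may follow, so read one line only.
postmaster_pid() {
    head -n 1 "${1%/}/postmaster.pid"
}

# Ask the postmaster to shut down and wait for it to exit.
#   $1 = data directory
#   $2 = signal name (TERM = smart shutdown, INT = fast shutdown)
#   $3 = seconds to wait before giving up
# Deliberately never escalates to SIGKILL.
stop_postmaster() {
    pid=$(postmaster_pid "$1") || return 1
    kill -"${2:-TERM}" "$pid" || return 1
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "${3:-60}" ] && return 1   # still running: report failure
        sleep 1
        waited=$((waited + 1))
    done
}

# Usage (path taken from this thread): try smart first, then fast.
#   stop_postmaster /opt/pgsql/data TERM 60 || stop_postmaster /opt/pgsql/data INT 60
```

If both attempts time out, the function simply reports failure, which leaves
the decision about what to do next with the administrator.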

> Then it gives the following error on stdout :
>
> LOG:  database system was interrupted at 2008-03-06 14:15:17 IST
> LOG:  record with incorrect prev-link 1/0 at 0/A4EB08
> LOG:  invalid primary checkpoint record
> LOG:  record with incorrect prev-link 42FD/0 at 0/A4EAC8
> LOG:  invalid secondary checkpoint record
> PANIC:  could not locate a valid checkpoint record

Ouch. It can't handle either of the checkpoints, and so it can't load 
the database.


I don't know what database repair tools exist, but personally at this 
point I'd be glad my backups are always kept up to date.

> What is the problem? It was running fine all this time.

I suspect that killing it without giving it a chance to do any cleanup 
operations might not have helped.


What's your server configuration? Could you have disabled any safe I/O 
options to get some more speed out of the database, perhaps?
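
One quick way to answer that question is to look for an explicit fsync line
in postgresql.conf. This is a sketch only: fsync_setting is a hypothetical
helper name, it assumes the configuration file lives in the data directory
and that the last uncommented assignment wins, and it does not handle quoted
values.

```shell
#!/bin/sh
# fsync_setting is a hypothetical helper, not a PostgreSQL tool.
# It prints the last uncommented "fsync" assignment in postgresql.conf,
# or "unset" when the file leaves the parameter at the server default.
fsync_setting() {
    conf="${1%/}/postgresql.conf"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    # Match "fsync" followed by whitespace or "=", so that other
    # parameters whose names merely start with "fsync" are ignored.
    value=$(sed -n 's/^[[:space:]]*fsync[[:space:]=][[:space:]=]*\([A-Za-z0-9]*\).*/\1/p' "$conf" | tail -n 1)
    echo "${value:-unset}"
}

# Usage (path taken from this thread):
#   fsync_setting /opt/pgsql/data
```

An answer of "unset" means the file does not override the parameter, in
which case the server default (fsync enabled) applies.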


I'm pretty sure 8.x copes with SIGKILL (because of its use of WAL 
logging, strong fsync requirements, etc) though of course it's still not 
a good idea. I don't know about 7.x .


--
Craig Ringer
