Re: [mail] Re: [HACKERS] Windows Build System

Lamar Owen Thu, 30 Jan 2003 14:30:49 -0800

On Thursday 30 January 2003 16:54, Tom Lane wrote:
> Lamar Owen <[EMAIL PROTECTED]> writes:
> > And, by the way, who in their right mind tests a database server by
> > repeated yanking of the AC power?


> Anybody who would like their data to survive a power outage.

I don't buy that.  That's why I have $36,000 worth of lead-acid batteries in 
the room next door, with $5,000 of inverters and chargers in the server room.  Until I 
had to upgrade RAM I had 240+ days of uptime on one box.  The longest power 
interruption was 28 hours.  The battery held the whole time.  There was never 
more than 30 days between interruptions.  The last time I had the server 
actually power down was during a maintenance run on the inverter/charge 
system, and I had to transfer power to the servers onto another branch, 
necessitating two power cycles, which were clean shutdown/reboots.  I haven't 
had an unscheduled dirty powerdown in two years.

On no system can we guarantee that the data survives a sudden power outage. 
Until we can be certain that the write-back cache on that high-performance 
drive (or NAS array using iSCSI, perhaps) has flushed, we cannot know that the 
data hit the disks.
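
A minimal sketch of the flush problem, assuming Python 3. The names 
durable_write and demo are hypothetical and are not taken from PostgreSQL.  
It shows the most an application can do by itself: write, then fsync.  The 
fsync call asks the OS to push the data to the device, but it says nothing 
certain about a volatile write-back cache inside the drive or array.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path and ask the OS to flush it to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        # fsync returns once the OS has handed the data to the device.
        # A drive with a volatile write-back cache may still hold it in RAM,
        # so this is necessary for durability but may not be sufficient.
        os.fsync(fd)
    finally:
        os.close(fd)


def demo():
    """Write a record to a scratch file with fsync, then read it back."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "record.dat")
        durable_write(path, b"committed")
        with open(path, "rb") as handle:
            return handle.read()
```

Reading the record back, as demo does, only proves that the OS has the data.  
Whether it would survive a pulled plug depends on the hardware underneath.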

> >  To go to that extreme for Win32 when we caution
> > against something as mundane as a kill -9 of postmaster on Unix is
> > absurd. And, yes, I know the difference.  I also know that the AC power
> > pull has nothing to do with PostgreSQL, but it has to do with the OS
> > under it. Although a kill -9, from the point of view of the running
> > process, is identical to a power failure.

> No, it is not.  Did you not read my comments earlier today?

Of course I did -- I'm not daft.  And that's why I specified 'from the point 
of view of the running process' -- that is, the process you are SIGKILLing 
cannot itself determine the difference between the power cycle and SIGKILL.  
It simply goes down, hard.  Of course there is one difference:

> I forgot to mention one of the biggest
> headaches, which is that kill -9 the postmaster doesn't kill the child
> backends.

This is a real difference, and one that I forgot as well. So SIGKILL is 
different for the backend system as a whole, but not for the single process 
that is being SIGKILL'd.  Suppose I issue a SIGKILL to postmaster and all 
forked backends simultaneously?  Where does SIGKILL differ from a power 
failure from the point of view of the database system in that scenario?  This 
also assumes that you cleanly reboot the OS after the SIGKILL to postmaster, 
as there is that dynamic state you mentioned to worry about.  I probably 
should have mentioned that before.
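
A minimal sketch of the orphaned-child behavior, assuming a POSIX system and 
Python 3.  The processes are plain stand-ins for postmaster and a backend, 
not PostgreSQL itself, and child_survives_parent_sigkill is a hypothetical 
name.  It sends SIGKILL to a parent and then checks whether the process that 
parent started is still running.

```python
import os
import signal
import subprocess
import sys

# Stand-in "postmaster": starts one long-lived child and reports its pid.
PARENT_SCRIPT = r"""
import subprocess, sys, time
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(child.pid, flush=True)
time.sleep(60)
"""


def child_survives_parent_sigkill():
    """SIGKILL the parent, then report (child_alive, parent_returncode)."""
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT_SCRIPT],
        stdout=subprocess.PIPE,
        text=True,
    )
    child_pid = int(parent.stdout.readline())

    parent.send_signal(signal.SIGKILL)  # cannot be caught or ignored
    parent.wait()
    parent.stdout.close()

    try:
        os.kill(child_pid, 0)  # signal 0 only checks that the pid exists
        alive = True
    except ProcessLookupError:
        alive = False

    if alive:
        os.kill(child_pid, signal.SIGKILL)  # clean up the orphan
    return alive, parent.returncode
```

The parent gets no chance to run any cleanup, which matches the 'point of 
view of the running process' argument, while the child keeps running until 
something else stops it.  A power failure takes both down at once.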

> Windows
> is going to bring a whole new set of failure modes that we don't have
> defenses for.  (Yet.)  *That* is what we need extensive testing to learn
> about, and claiming that we are discriminating against Windows just
> because it's Windows misses the point completely.

And ISTM that an experienced Windows developer, such as Katie or Dave, would 
know to do this, would know how to do this, and would know the best way of 
doing this.  And I wasn't singling you out, Tom.  It was the whole thread and 
the turns it took that got me rather upset. 

> Or, if you prefer, we can ship Postgres 7.4 for Windows with no more
> testing than we need for any of the existing, long-since-well-tested
> ports.  But I'll bet a great deal that our reputation will go down the
> drain (along with many people's data) if we do that.

We don't have a standard testing methodology for any of our ports.  We need 
one for all of our ports.  I fully expect the Win32 port to need a different 
methodology than the FreeBSD port or the Linux port.  And I expect we have 
enough experienced Win32 developers here (which I am not) who can provide 
insight into how the methodologies should differ.

I prefer more extensive testing for all of our ports.  You did read that when 
I wrote it, right?  (When I wrote it multiple times....)  Just saying 'it 
passed regression' shouldn't be enough -- but we should really spend some 
cycles thinking about what the test suite really should be.  For all 
platforms.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11

