Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-12 Thread Bruno Wolff III
On Thu, May 11, 2006 at 18:41:25 -0500,
  Jim C. Nasby [EMAIL PROTECTED] wrote:
 On Thu, May 11, 2006 at 07:20:27PM -0400, Bruce Momjian wrote:
 
 My damn powerbook drive recently failed with very little warning, other
 than I did notice that disk activity seemed to be getting a bit slower.
 IIRC it didn't log any errors or anything. Even if it did, if the OS was
 catching them I'd hope it would pop up a warning or something. But from
 what I've heard, some drives now-a-days will silently remap dead sectors
 without telling the OS anything, which is great until you've used up all
 of the spare sectors and there's nowhere to remap to. :(

You might look into smartmontools. One part of this is a daemon that runs
selftests on the disks on a regular basis. You can have warnings mailed to
you on various conditions. Drives will fail the self test before they
run out of spare sectors. There are other drive characteristics that can
be used to tell if drive failure is imminent and give you a chance to replace
a drive before it fails.
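
The kind of check this automates can be sketched roughly as below. This is only an illustration in Python, not part of smartmontools: it assumes the attribute table in the text layout that `smartctl -A` usually prints, the sample text is invented for the example, and treating any non-zero raw count as a warning is an assumption rather than a documented threshold.

```python
# Illustrative sample only; real values come from `smartctl -A /dev/...`.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4821
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

# Attributes that hint at sector remapping (choice of names is an assumption).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for rows with a plain integer raw value."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def worrying_attributes(attrs):
    """Return the watched attributes whose raw count is above zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}
```

Feeding it the sample text flags only the reallocated-sector count; the smartd daemon can mail you on such conditions without any scripting on your side.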

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-11 Thread Jim C. Nasby
On Tue, May 09, 2006 at 12:10:32PM +0200, Jean-Yves F. Barbier wrote:
  I myself can't see much reason to spend $500 on high end controller
  cards for a simple Raid 1.
 
 Naa, you can find ATA | SATA ctrlrs for about EUR30 !
 
And you're likely getting what you paid for: crap. Such a controller is
less likely to do things like turn off write caching so that fsync works
properly.

  + Hardware Raids might be a bit easier to manage, if you never spend a
  few hours to learn Software Raid Tools.
 
 I'd say the same (mostly as you still have to punch a command line for
 most of the controllers)
 
Controllers I've seen have some kind of easy to understand GUI, at least
during bootup. When it comes to OS-level tools that's going to vary
widely.

  + There are situations in which Software Raids are faster, as CPU power
  has advanced dramatically in the last years and even high end controller
  cards cannot keep up with that.
 
 Definitely NOT, however if your server doesn't have a heavy load, the
 software overhead can't be noticed (essentially cache managing and
 syncing)
 
 For dual-core CPUs, it might be true

Depends. RAID performance depends on a heck of a lot more than just CPU.
Software RAID allows you to do things like spread load across multiple
controllers, so you can scale a lot higher for less money. Though in
this case I doubt that's a consideration, so what's more important is
making sure the controller bus isn't in the way. One thing that
means is ensuring that every SATA drive has its own dedicated
controller, since a lot of SATA hardware can't handle multiple commands
on the bus at once.

  + Using SATA drives is always a bit of risk, as some drives are lying
  about whether they are caching or not.
 
 ?? Do you intend to use your server without a UPS ??

Have you never heard of someone tripping over a plug? Or a power supply
failing? Or the OS crashing? If fsync is properly obeyed, PostgreSQL
will gracefully recover from all of those situations. If it's not,
you're at risk of losing the whole database.

  + Using hardware controllers, the array becomes locked to a particular
  vendor. You can't switch controller vendors as the array meta
  information is stored proprietary. In case the Raid is broken to a level
  the controller can't recover automatically this might complicate manual
  recovery by specialists.
 
 ?? Do you intend not to make backups ??

Even with backups this is still a valid concern, since the backup will
be nowhere near as up-to-date as the database was unless you have a
pretty low DML rate.

 BUT a hardware controller is about EUR2000 and a (ATA/SATA) 500GB HD
 is ~ EUR350.

Huh? You can get 3ware controllers for about $500, and they're pretty
decent. While I'm sure there are controllers for $2k, that doesn't mean
there's nothing in between that and nothing.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Douglas McNaught
Greg Stark [EMAIL PROTECTED] writes:

 Douglas McNaught [EMAIL PROTECTED] writes:

 Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive.

 Well, dollar for dollar you would get the best performance from slower drives
 anyways since it would give you more spindles. 15kRPM drives are *expensive*.

Depends on your power, heat and rack space budget too...  If you need
max performance out of a given rack space (rather than max density),
SCSI is still the way to go.  I'll definitely agree that SATA is
becoming much more of a player in the server storage market, though.

-Doug



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Markus Schaber
Hi, Scott & all,

Scott Lamb wrote:

 I don't know the answer to this question, but have you seen this tool?
 
 http://brad.livejournal.com/2116715.html

We had a simpler tool inhouse, which wrote a file byte-for-byte, and
called fsync() after every byte.

If the number of fsyncs/min is higher than the rotations-per-minute
rating of your disks, they must be lying.

It does not find as many liars as the script above, but it is less
intrusive (it can be run on any low-io machine without crashing it), and
it found some liars in-house (some notebook disks, one external
USB/FireWire to IDE case, and an older Linux cryptoloop implementation,
IIRC).
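
A minimal sketch of that kind of probe, in Python rather than the original C (which is not shown here). The function names and the default duration are made up for illustration. Note the assumption behind the comparison: it only makes sense for a plain rotating disk, since a controller with a battery-backed cache can legitimately exceed the rotation rate.

```python
import os
import time


def fsyncs_per_minute(path, duration_s=5.0):
    """Append one byte at a time, fsync after each byte, return fsyncs/min."""
    count = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.monotonic()
        while time.monotonic() - start < duration_s:
            os.write(fd, b"x")
            os.fsync(fd)  # ask the OS to push the data to stable storage
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    if elapsed <= 0:
        return 0.0
    return count * 60.0 / elapsed


def must_be_lying(rate_per_minute, disk_rpm):
    """True if more syncs were acknowledged than the platter made rotations."""
    return rate_per_minute > disk_rpm
```

To use it, point `fsyncs_per_minute` at a file on the disk under test and compare the result with the drive's rated RPM, e.g. `must_be_lying(rate, 7200)`.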

If you're interested, I can dig for the C source...

HTH,
Markus




-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Bruce Momjian
Markus Schaber wrote:
 Hi, Scott & all,
 
 Scott Lamb wrote:
 
  I don't know the answer to this question, but have you seen this tool?
  
  http://brad.livejournal.com/2116715.html
 
 We had a simpler tool inhouse, which wrote a file byte-for-byte, and
 called fsync() after every byte.
 
 If the number of fsyncs/min is higher than your rotations per minute
 value of your disks, they must be lying.
 
 It does not find as many liars as the script above, but it is less

Why does it find fewer liars?

---

 intrusive (it can be run on any low-io machine without crashing it), and
 it found some liars in-house (some notebook disks, one external
 USB/FireWire to IDE case, and an older Linux cryptoloop implementation,
 IIRC).
 
 If you're interested, I can dig for the C source...
 
 HTH,
 Markus
 
 
 
 
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Vivek Khera


On May 10, 2006, at 12:41 AM, Greg Stark wrote:

Well, dollar for dollar you would get the best performance from  
slower drives
anyways since it would give you more spindles. 15kRPM drives are  
*expensive*.


Personally, I don't care that much about dollar for dollar; I just
need performance.  If it is within a factor of 2 or 3 in price then
I'll go for absolute performance over bang for the buck.






Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Bruce Momjian
Vivek Khera wrote:
 
 On May 10, 2006, at 12:41 AM, Greg Stark wrote:
 
  Well, dollar for dollar you would get the best performance from  
  slower drives
  anyways since it would give you more spindles. 15kRPM drives are  
  *expensive*.
 
 Personally, I don't care that much for dollar for dollar I just  
 need performance.  If it is within a factor of 2 or 3 in price then  
 I'll go for absolute performance over bang for the buck.

That is really the issue.  You can buy lots of consumer-grade stuff and
work just fine if your performance/reliability tolerance is high enough.

However, don't fool yourself that consumer and server-grade hardware are
internally the same, or get the same testing.

I just had a Toshiba laptop drive replaced last week (new, not
refurbished), only to have it fail this week.  Obviously there isn't
sufficient burn-in done by Toshiba, and I don't fault them because it is
a consumer laptop --- it fails, they replace it.  For servers, the
downtime usually can't be tolerated, while consumers usually can
tolerate significant downtime.

I have always purchased server-grade hardware for my home server, and I
think I have had one day of hardware downtime in the past ten years. 
Consumer hardware just couldn't do that.

As one data point, most consumer-grade IDE drives are designed to be run
only 8 hours a day.  The engineering doesn't anticipate 24-hour
operation, and that trade-off passes all the way through the selection
of components for the drive, which generates significant cost savings.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Markus Schaber
Hi, Bruce,

Bruce Momjian wrote:


It does not find as many liars as the script above, but it is less
 
 Why does it find fewer liars?

It won't find liars that have a small lie-queue-length, so that their
internal buffers fill up and they have to block. After a small burst at
the start, which usually hides in other latencies, they don't get more
throughput than spindle turns.

It won't find liars that first acknowledge to the host, and then
immediately write the block before accepting other commands. This
improves latency (which is measured in some benchmarks), but not
syncs/write rate.

Both of them can be caught by the other script, but not by my tool.

HTH,
Markus


-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Scott Marlowe
On Tue, 2006-05-09 at 20:02, Bruce Momjian wrote:
 Scott Marlowe wrote:
  Actually, in the case of the Escalades at least, the answer is yes. 
  Last year (maybe a bit more) someone was testing an IDE escalade
  controller with drives that were known to lie, and it passed the power
  plug pull test repeatedly.  Apparently, the escalades tell the drives to
  turn off their cache.  While most all IDEs and a fair number of SATA
  drives lie about cache fsyncing, they all seem to turn off the cache
  when you ask.
  
  And, since a hardware RAID controller with bbu cache has its own cache,
  it's not like it really needs the one on the drives anyway.
 
 You do if the controller thinks the data is already on the drives and
 removes it from its cache.

Bruce, re-read what I wrote.  The escalades tell the drives to TURN OFF
THEIR OWN CACHE.



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Douglas McNaught
Scott Marlowe [EMAIL PROTECTED] writes:

 On Tue, 2006-05-09 at 20:02, Bruce Momjian wrote:

 You do if the controller thinks the data is already on the drives and
 removes it from its cache.

 Bruce, re-read what I wrote.  The escalades tell the drives to TURN OFF
 THEIR OWN CACHE.

Some ATA drives would lie about that too IIRC.  Hopefully they've
stopped doing it in the SATA era.

-Doug



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Markus Schaber
Hi, Bruce,

Markus Schaber wrote:

It does not find as many liars as the script above, but it is less
Why does it find fewer liars?
 
 It won't find liars that have a small lie-queue-length, so that their
 internal buffers fill up and they have to block. After a small burst at
 the start, which usually hides in other latencies, they don't get more
 throughput than spindle turns.

I just reread my mail, and must admit that I would not understand what I
wrote above, so I'll explain a little more:

My test program writes byte-for-byte. Let's say our FS/OS has a 4k page
and block size; that means 4096 writes that all write the same disk block.

Intelligent liars will see that the 2nd and all further writes
obsolete the former writes that still reside in the internal cache, and
drop those former writes from the cache, effectively going up to 4k
writes/spindle turn.

Dumb liars will keep the obsolete writes in the write cache / queue, and
so won't be caught by my program. (Note that I have no proof that such
disks actually exist, but I have enough experience with hardware that I
wouldn't be surprised.)
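
The difference between the two kinds can be shown with a toy model. This is a deliberately simplified sketch, not a description of any real firmware: it assumes all writes hit the same block, that one pending write to that block reaches the platter per rotation, and that a full queue makes the drive block until the next rotation. The function and parameter names are invented for the example.

```python
def acked_writes_per_rotation(rotations, offered_per_rotation, queue_len, coalesce):
    """Toy model: average number of writes acknowledged per platter rotation."""
    pending = 0  # writes acknowledged to the host but not yet on the platter
    acked = 0
    for _ in range(rotations):
        for _ in range(offered_per_rotation):
            if coalesce:
                # "Intelligent" liar: a newer write replaces the pending one.
                pending = 1
                acked += 1
            elif pending < queue_len:
                # "Dumb" liar: every write takes its own queue slot.
                pending += 1
                acked += 1
            else:
                # Queue full: the drive blocks until the next rotation.
                break
        # One write to this block can reach the platter per rotation.
        if pending > 0:
            pending -= 1
    return acked / rotations
```

With 4096 writes offered per rotation, the coalescing drive acknowledges all of them, while a drive with an 8-entry queue settles at about one per rotation after the initial burst, which is what an honest drive would show too.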


HTH,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-10 Thread Scott Marlowe
On Wed, 2006-05-10 at 09:51, Douglas McNaught wrote:
 Scott Marlowe [EMAIL PROTECTED] writes:
 
  On Tue, 2006-05-09 at 20:02, Bruce Momjian wrote:
 
  You do if the controller thinks the data is already on the drives and
  removes it from its cache.
 
  Bruce, re-read what I wrote.  The escalades tell the drives to TURN OFF
  THEIR OWN CACHE.
 
 Some ATA drives would lie about that too IIRC.  Hopefully they've
 stopped doing it in the SATA era.

Ugh.  Now that would make for a particularly awful bit of firmware
implementation.  I'd think that if I found a SATA drive doing that I'd
be likely to strike the manufacturer off of the list for possible future
purchases...



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Steve Atkins


On May 9, 2006, at 2:16 AM, Hannes Dorbath wrote:


Hi,

I've just had some discussion with colleagues regarding the usage  
of hardware or software raid 1/10 for our linux based database  
servers.


I myself can't see much reason to spend $500 on high end controller  
cards for a simple Raid 1.


Any arguments pro or contra would be desirable.

From my experience and what I've read here:

+ Hardware Raids might be a bit easier to manage, if you never  
spend a few hours to learn Software Raid Tools.


+ There are situations in which Software Raids are faster, as CPU  
power has advanced dramatically in the last years and even high end  
controller cards cannot keep up with that.


+ Using SATA drives is always a bit of risk, as some drives are  
lying about whether they are caching or not.


Don't buy those drives. That's unrelated to whether you use hardware
or software RAID.



+ Using hardware controllers, the array becomes locked to a  
particular vendor. You can't switch controller vendors as the array  
meta information is stored proprietary. In case the Raid is broken  
to a level the controller can't recover automatically this might  
complicate manual recovery by specialists.


Yes. Fortunately we're using the RAID for database work, rather than file
storage, so we can use all the nice postgresql features for backing up
and replicating the data elsewhere, which avoids most of this issue.



+ Even battery backed controllers can't guarantee that data written  
to the drives is consistent after a power outage, neither that the  
drive does not corrupt something during the involuntary shutdown /  
power irregularities. (This is theoretical as any server will be  
UPS backed)


fsync of WAL log.

If you have a battery backed writeback cache then you can get the
reliability of fsyncing the WAL for every transaction, and the
performance of not needing to hit the disk for every transaction.

Also, if you're not doing that you'll need to dedicate a pair of
spindles to the WAL log if you want to get good performance, so that
there'll be no seeking on the WAL. With a writeback cache you can put
the WAL on the same spindles as the database and not lose much, if
anything, in the way of performance. If that saves you the cost of two
additional spindles, and the space on your drive shelf for them, you've
just paid for a reasonably priced RAID controller.
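
A rough back-of-envelope for why the cache matters, as a sketch only. It assumes a simplified model: with no write cache, a session committing one transaction at a time waits for the WAL block to reach the platter, and the same block can be rewritten about once per rotation. Seek time and grouping of commits across concurrent sessions are ignored, and the function name is made up for illustration.

```python
def max_serial_commits_per_second(rpm):
    """Upper bound on commits/sec for one session when each commit costs a rotation."""
    return rpm / 60.0


if __name__ == "__main__":
    for rpm in (7200, 10000, 15000):
        print(rpm, "RPM ->", round(max_serial_commits_per_second(rpm)), "commits/sec at most")
```

A battery-backed write-back cache removes that per-commit wait, since the controller can acknowledge the fsync once the data is in its protected cache.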


Given those advantages... I can't imagine speccing a large system that
didn't have a battery-backed write-back cache in it. My dev systems
mostly use software RAID, if they use RAID at all. But my production
boxes all use SATA RAID (and I tell my customers to use controllers with
BB cache, whether it be SCSI or SATA).

My usual workloads are write-heavy. If yours are read-heavy that will
move the sweet spot around significantly, and I can easily imagine that
for a read-heavy load software RAID might be a much better match.

Cheers,
  Steve




Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Joshua D. Drake


Don't buy those drives. That's unrelated to whether you use hardware
or software RAID.


Sorry that is an extremely misleading statement. SATA RAID is perfectly 
acceptable if you have a hardware raid controller with a battery backup 
controller.


And dollar for dollar, SCSI will NOT be faster nor have the hard drive 
capacity that you will get with SATA.


Sincerely,

Joshua D. Drake


--

   === The PostgreSQL Company: Command Prompt, Inc. ===
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
 Providing the most comprehensive  PostgreSQL solutions since 1997
http://www.commandprompt.com/





Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Steve Atkins


On May 9, 2006, at 8:51 AM, Joshua D. Drake wrote:

(Using SATA drives is always a bit of risk, as some drives are lying  
about whether they are caching or not.)



Don't buy those drives. That's unrelated to whether you use hardware
or software RAID.


Sorry that is an extremely misleading statement. SATA RAID is  
perfectly acceptable if you have a hardware raid controller with a  
battery backup controller.


If the drive says it's hit the disk and it hasn't then the RAID
controller will have flushed the data from its cache (or flagged it as
correctly written). At that point the only place the data is stored is
in the non battery backed cache on the drive itself. If something fails
then you'll have lost data.

You're not suggesting that a hardware RAID controller will protect
you against drives that lie about sync, are you?



And dollar for dollar, SCSI will NOT be faster nor have the hard  
drive capacity that you will get with SATA.


Yup. That's why I use SATA RAID for all my databases.

Cheers,
  Steve



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Douglas McNaught
Vivek Khera [EMAIL PROTECTED] writes:

 On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:

 And dollar for dollar, SCSI will NOT be faster nor have the hard
 drive capacity that you will get with SATA.

 Does this hold true still under heavy concurrent-write loads?  I'm
 preparing yet another big DB server and if SATA is a better option,
 I'm all (elephant) ears.

Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive.

-Doug



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Joshua D. Drake



You're not suggesting that a hardware RAID controller will protect
you against drives that lie about sync, are you?


Of course not, but which drives lie about sync that are SATA? Or more 
specifically SATA-II?


Sincerely,

Joshua D. Drake



--

   === The PostgreSQL Company: Command Prompt, Inc. ===
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
 Providing the most comprehensive  PostgreSQL solutions since 1997
http://www.commandprompt.com/





Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Steve Atkins


On May 9, 2006, at 11:26 AM, Joshua D. Drake wrote:




You're not suggesting that a hardware RAID controller will protect
you against drives that lie about sync, are you?


Of course not, but which drives lie about sync that are SATA? Or  
more specifically SATA-II?


SATA-II, none that I'm aware of, but there's a long history of dodgy
behaviour designed to pump up benchmark results down in the
consumer drive space, and low end consumer space is where a
lot of SATA drives are. I wouldn't be surprised to see that behaviour
there still.

I was responding to the original poster's assertion that drives lying
about sync were a reason not to buy SATA drives, by telling him
not to buy drives that lie about sync. You seem to have read this
as "don't buy SATA drives", which is not what I said and not what I
meant.

Cheers,
  Steve



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Joshua D. Drake

Douglas McNaught wrote:

Vivek Khera [EMAIL PROTECTED] writes:


On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:


And dollar for dollar, SCSI will NOT be faster nor have the hard
drive capacity that you will get with SATA.

Does this hold true still under heavy concurrent-write loads?  I'm
preparing yet another big DB server and if SATA is a better option,
I'm all (elephant) ears.


Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive.


Best I have seen is 10k but if I can put 4x the number of drives in the 
array at the same cost... I don't need 15k.


Joshua D. Drake



-Doug





--

   === The PostgreSQL Company: Command Prompt, Inc. ===
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
 Providing the most comprehensive  PostgreSQL solutions since 1997
http://www.commandprompt.com/





Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Scott Marlowe
On Tue, 2006-05-09 at 12:52, Steve Atkins wrote:
 On May 9, 2006, at 8:51 AM, Joshua D. Drake wrote:
 
 (Using SATA drives is always a bit of risk, as some drives are lying  
 about whether they are caching or not.)
 
  Don't buy those drives. That's unrelated to whether you use hardware
  or software RAID.
 
  Sorry that is an extremely misleading statement. SATA RAID is  
  perfectly acceptable if you have a hardware raid controller with a  
  battery backup controller.
 
 If the drive says it's hit the disk and it hasn't then the RAID  
 controller
 will have flushed the data from its cache (or flagged it as correctly
 written). At that point the only place the data is stored is in the non
 battery backed cache on the drive itself. If something fails then you'll
 have lost data.
 
 You're not suggesting that a hardware RAID controller will protect
 you against drives that lie about sync, are you?

Actually, in the case of the Escalades at least, the answer is yes. 
Last year (maybe a bit more) someone was testing an IDE escalade
controller with drives that were known to lie, and it passed the power
plug pull test repeatedly.  Apparently, the escalades tell the drives to
turn off their cache.  While most all IDEs and a fair number of SATA
drives lie about cache fsyncing, they all seem to turn off the cache
when you ask.

And, since a hardware RAID controller with bbu cache has its own cache,
it's not like it really needs the one on the drives anyway.



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Bruce Momjian
Scott Marlowe wrote:
 Actually, in the case of the Escalades at least, the answer is yes. 
 Last year (maybe a bit more) someone was testing an IDE escalade
 controller with drives that were known to lie, and it passed the power
 plug pull test repeatedly.  Apparently, the escalades tell the drives to
 turn off their cache.  While most all IDEs and a fair number of SATA
 drives lie about cache fsyncing, they all seem to turn off the cache
 when you ask.
 
 And, since a hardware RAID controller with bbu cache has its own cache,
 it's not like it really needs the one on the drives anyway.

You do if the controller thinks the data is already on the drives and
removes it from its cache.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Jean-Yves F. Barbier
Hi Hannes,

Hannes Dorbath wrote:
 Hi,
 
 I've just had some discussion with colleagues regarding the usage of
 hardware or software raid 1/10 for our linux based database servers.
 
 I myself can't see much reason to spend $500 on high end controller
 cards for a simple Raid 1.

Naa, you can find ATA | SATA ctrlrs for about EUR30 !

 Any arguments pro or contra would be desirable.
 
 From my experience and what I've read here:
 
 + Hardware Raids might be a bit easier to manage, if you never spend a
 few hours to learn Software Raid Tools.

I'd say the same (mostly as you still have to punch a command line for
most of the controllers)

 + There are situations in which Software Raids are faster, as CPU power
 has advanced dramatically in the last years and even high end controller
 cards cannot keep up with that.

Definitely NOT, however if your server doesn't have a heavy load, the
software overhead can't be noticed (essentially cache managing and
syncing)

For dual-core CPUs, it might be true


 + Using SATA drives is always a bit of risk, as some drives are lying
 about whether they are caching or not.

?? Do you intend to use your server without a UPS ??

 + Using hardware controllers, the array becomes locked to a particular
 vendor. You can't switch controller vendors as the array meta
 information is stored proprietary. In case the Raid is broken to a level
 the controller can't recover automatically this might complicate manual
 recovery by specialists.

?? Do you intend not to make backups ??

 + Even battery backed controllers can't guarantee that data written to
 the drives is consistent after a power outage, neither that the drive
 does not corrupt something during the involuntary shutdown / power
 irregularities. (This is theoretical as any server will be UPS backed)

RAID's laws:

1- RAID prevents you from losing data on healthy disks, not on faulty
   disks,

1b- So format and reformat your RAID disks (whether SCSI, ATA or SATA)
several times, with destructive tests (see the -c -c option in
the mke2fs man page) - It will ensure that the disks are safe, and
also acts as a kind of burn-in test (might turn into... days of
formatting!),

2- RAID doesn't protect you from a power supply failure or a power
   outage, so use a (LARGE) UPS,

2b- LARGE UPS because HDs are the components that have the highest power
consumption (a 700VA UPS gives me about 10-12 minutes on a machine
with an XP2200+, 1GB RAM and a 40GB HD, however this falls to...
less than 25 seconds with seven HDs! all ATA),

2c- Use a server box with redundant power supplies,

3- As for any sensitive data, make regular backups or you'll be a
sitting duck.

Some hardware ctrlrs are able to avoid the loss of a disk if you turn
out to have some faulty sectors (by relocating them internally); software
RAID doesn't, as sectors *must* be at the same (linear) addresses.

BUT a hardware controller is about EUR2000 and a (ATA/SATA) 500GB HD
is ~ EUR350.

That means you have to consider:

* The server availability (time to change a power supply if there is
   no redundancy, time to exchange a non-hot-swap HD... In fact, how
   much downtime you can afford),

* The volume of the data (which determines the size of the backup
  device),

* The backup device you'll use (tape or other HDs),

* The load of the server (and the number of simultaneous users =
  Soft|Hard, ATA/SATA|SCSI...),

* The money you can spend on such a server

* And most important, the color of your boss' tie the day you
   make the decision.

Hope it will help you

Jean-Yves




Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Scott Lamb

On May 9, 2006, at 11:26 AM, Joshua D. Drake wrote:
Of course not, but which drives lie about sync that are SATA? Or  
more specifically SATA-II?


I don't know the answer to this question, but have you seen this tool?

http://brad.livejournal.com/2116715.html

It attempts to experimentally determine if, with your operating  
system version, controller, and hard disk, fsync() does as claimed.  
Of course, experimentation can't prove the system is correct, but it  
can sometimes prove the system is broken.


I say it's worth running on any new model of disk, any new  
controller, or after the Linux kernel people rewrite everything (i.e.  
on every point release).


I have to admit to hypocrisy, though...I'm running with systems that  
other people ordered and installed, I doubt they were this thorough,  
and I don't have identical hardware to run tests on. So no real way  
to do this.


Regards,
Scott

--
Scott Lamb http://www.slamb.org/





Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread Greg Stark
Douglas McNaught [EMAIL PROTECTED] writes:

 Vivek Khera [EMAIL PROTECTED] writes:
 
  On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:
 
  And dollar for dollar, SCSI will NOT be faster nor have the hard
  drive capacity that you will get with SATA.
 
  Does this hold true still under heavy concurrent-write loads?  I'm
  preparing yet another big DB server and if SATA is a better option,
  I'm all (elephant) ears.
 
 Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive.

Well, dollar for dollar you would get the best performance from slower drives
anyways since it would give you more spindles. 15kRPM drives are *expensive*.
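
To put rough numbers on the spindle argument, here is a sketch under stated assumptions. It uses the common approximation that one random I/O costs an average seek plus half a rotation. The seek times below are illustrative guesses, not measurements, and the 4:1 drive count is borrowed from the "4x the number of drives at the same cost" figure mentioned elsewhere in this thread rather than from any price list.

```python
def random_iops(rpm, avg_seek_ms):
    """Approximate random I/Os per second for a single drive."""
    half_rotation_ms = 0.5 * 60_000.0 / rpm  # average rotational latency
    return 1000.0 / (avg_seek_ms + half_rotation_ms)


if __name__ == "__main__":
    fast = random_iops(15000, avg_seek_ms=3.5)      # one 15kRPM drive (assumed seek)
    slow = 4 * random_iops(7200, avg_seek_ms=8.5)   # four 7200RPM drives (assumed seek)
    print("one 15k drive: %.0f IOPS, four 7200 drives: %.0f IOPS" % (fast, slow))
```

Under these assumptions the four slower drives win on aggregate random I/O, provided the workload actually spreads across all of them.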

-- 
greg

