-performance-ow...@postgresql.org] On Behalf Of Rick Otten
Sent: Wednesday, February 24, 2016 9:06 AM
To: Dave Stibrany
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Filesystem and Disk Partitioning for New Server Setup
An LVM gives you more options.
Without an LVM you would add a disk to the system, create a tablespace, and
then move some of your tables over to the new disk. Or, you'd take a full
backup, rebuild your file system, and then restore from backup onto the
newer, larger disk configuration. Or you'd
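A minimal sketch of that tablespace route, assuming hypothetical device, mount point, and table names:
# new disk, hypothetical device and mount point
mkfs.xfs /dev/sdb1
mount /dev/sdb1 /pg/disk2
chown postgres /pg/disk2
# register it with PostgreSQL and move a table onto it
psql -c "CREATE TABLESPACE disk2 LOCATION '/pg/disk2'"
psql -c "ALTER TABLE big_table SET TABLESPACE disk2"   # rewrites the table's files on the new disk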
Thanks for the advice, Rick.
I have an 8 disk chassis, so possible extension paths down the line are
adding raid1 for WALs, adding another RAID10, or creating an 8 disk RAID10.
Would LVM make this type of addition easier?
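For comparison, a hedged sketch of the LVM route when another RAID set is added later (volume group, logical volume, and device names are hypothetical):
pvcreate /dev/sdc                       # the new RAID10 exposed by the controller
vgextend pgvg /dev/sdc                  # add it to the existing volume group
lvextend -l +100%FREE /dev/pgvg/pgdata  # grow the logical volume over the new space
xfs_growfs /var/lib/pgsql               # xfs can be grown while mounted, no dump/restore needed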
On Wed, Feb 24, 2016 at 6:08 AM, Rick Otten
wrote:
>
> 1) I'd go with xfs
1) I'd go with xfs. zfs might be a good alternative, but the last time I
tried it, it was really unstable (on Linux). It may have gotten a lot
better, but xfs is a safe bet and well understood.
2) An LVM is just an extra couple of commands. These days that is not a
lot of complexity given what y
I'm about to install a new production server and wanted some advice regarding
filesystems and disk partitioning.
The server is:
- Dell PowerEdge R430
- 1 x Intel Xeon E5-2620 2.4GHz
- 32 GB RAM
- 4 x 600GB 10k SAS
- PERC H730P Raid Controller with 2GB cache
The drives will be set up in one RAID-
"Gregory Stark" <[EMAIL PROTECTED]> writes:
> <[EMAIL PROTECTED]> writes:
>
>>> If you are completely over-writing an entire stripe, there's no reason to
>>> read the existing data; you would just calculate the parity information from
>>> the new data. Any good controller should take that approach
<[EMAIL PROTECTED]> writes:
>> If you are completely over-writing an entire stripe, there's no reason to
>> read the existing data; you would just calculate the parity information from
>> the new data. Any good controller should take that approach.
>
> in theory yes, in practice the OS writes usua
On Sat, 16 Aug 2008, Decibel! wrote:
On Aug 13, 2008, at 2:54 PM, Henrik wrote:
Additionally, you need to be careful of what size writes you're using. If
you're doing random writes that perfectly align with the raid stripe size,
you'll see virtually no RAID5 overhead, and you'll get the perfor
On Aug 13, 2008, at 2:54 PM, Henrik wrote:
Additionally, you need to be careful of what size writes you're
using. If you're doing random writes that perfectly align with the
raid stripe size, you'll see virtually no RAID5 overhead, and
you'll get the performance of N-1 drives, as opposed to
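As a hedged worked example of that full-stripe case (hypothetical geometry: a 4-drive RAID5 with a 64 kB stripe unit): a full stripe is (4 - 1) x 64 kB = 192 kB of data plus one 64 kB parity chunk, so a 192 kB write aligned on a stripe boundary lets the controller compute parity entirely from the new data, while anything smaller or misaligned forces it to read old data or parity first (the read-modify-write penalty).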
Greg Smith wrote:
On Wed, 13 Aug 2008, Ron Mayer wrote:
Second of all - ext3 fsync() appears to me to be *extremely* stupid.
It only seems to do the correct flushing (and waiting) for a
drive's cache when a file's inode has changed.
This is bad, but the way PostgreSQL
I've seen it written a couple of times in this thread, and in the
wikipedia article, that SOME sw raid configs don't support write
barriers. This implies that some do. Which ones do and which ones
don't? Does anybody have a list of them?
I was mainly wondering if sw RAID0 on top of hw RAID1 wou
On Wed, 13 Aug 2008, Ron Mayer wrote:
First off - some IDE drives don't even support the relatively recent ATA
command that apparently lets the software know when a cache flush is
complete.
Right, so this is one reason you can't assume barriers will be available.
And barriers don't work rega
Scott Marlowe wrote:
IDE came up corrupted every single time.
Greg Smith wrote:
you've drank the kool-aid ... completely
ridiculous ...unsafe fsync ... md0 RAID-1
array (aren't there issues with md and the barriers?)
Alright - I'll eat my words. Or mostly.
I still haven't found IDE drives
On 13 Aug 2008, at 17.13, Decibel! wrote:
On Aug 11, 2008, at 9:01 AM, Jeff wrote:
On Aug 11, 2008, at 5:17 AM, Henrik wrote:
OK, changed the SAS RAID 10 to RAID 5 and now my random writes are
handling 112 MB/sec. So it is almost twice as fast as the RAID10
with the same disks. Any ideas why?
On Wed, 13 Aug 2008, Ron Mayer wrote:
I assume test_fsync in the postgres source distribution is
a decent way to see?
Not really. It takes too long (runs too many tests you don't care about)
and doesn't spit out the results the way you want them--TPS, not average
time.
You can do it with
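Whatever tool was about to be named here, current PostgreSQL ships a contrib utility that reports exactly this, operations per second for each sync method; a hedged example invocation (path hypothetical):
pg_test_fsync -s 5 -f /var/lib/pgsql/data/pg_test_fsync.out   # 5 seconds per test, test file on the data disk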
On Aug 11, 2008, at 9:01 AM, Jeff wrote:
On Aug 11, 2008, at 5:17 AM, Henrik wrote:
OK, changed the SAS RAID 10 to RAID 5 and now my random writes are
handling 112 MB/sec. So it is almost twice as fast as the RAID10
with the same disks. Any ideas why?
Are the iozone tests faulty?
does IO
On Wed, Aug 13, 2008 at 8:41 AM, Ron Mayer
<[EMAIL PROTECTED]> wrote:
> Greg Smith wrote:
> But I still am looking for any evidence that there were any
> widely shipped SATA (or even IDE drives) that were at fault,
> as opposed to filesystem bugs and poor settings of defaults.
Well, if they're ge
Greg Smith wrote:
The below disk writes impossibly fast when I issue a sequence of fsync
'k. I've got some homework. I'll be trying to reproduce similar
with md raid, old IDE drives, etc to see if I can reproduce them.
I assume test_fsync in the postgres source distribution is
a decent way to
Scott Marlowe wrote:
On Tue, Aug 12, 2008 at 10:28 PM, Ron Mayer ...wrote:
Scott Marlowe wrote:
I can attest to the 2.4 kernel ...
...SCSI...AFAICT the write barrier support...
Tested both by pulling the power plug. The SCSI was pulled 10 times
while running 600 or so concurrent pgbench thr
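A hedged sketch of that style of pull-the-plug test (paths and scale hypothetical; the point is sustained write load, hard power loss, then a recovery check):
pgbench -c 100 -t 100000 pgbench &      # sustained concurrent write load
# yank the power cord mid-run, power the box back up, then:
pg_ctl -D /var/lib/pgsql/data start     # must replay WAL and come up cleanly
psql -d pgbench -c "SELECT count(*) FROM pgbench_history"   # recent pgbench table name; older versions used plain "history"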
On Tue, 12 Aug 2008, Ron Mayer wrote:
Really old software (notably 2.4 Linux kernels) didn't send
cache-synchronizing commands for either SCSI or ATA;
Surely not true. Write cache flushing has been a known problem in the
computer science world for several tens of years. The difference is that
On Tue, Aug 12, 2008 at 10:28 PM, Ron Mayer
<[EMAIL PROTECTED]> wrote:
> Scott Marlowe wrote:
>>
>> I can attest to the 2.4 kernel not being able to guarantee fsync on
>> IDE drives.
>
> Sure. But note that it won't for SCSI either; since AFAICT the write
> barrier support was implemented at the s
On Tue, 12 Aug 2008, Ron Mayer wrote:
Really old software (notably 2.4 Linux kernels) didn't send
cache-synchronizing commands for either SCSI or ATA; but
it seems well thought through in the 2.6 kernels as described
in the Linux kernel documentation.
http://www.mjmwired.net/kernel/Documentatio
Scott Marlowe wrote:
I can attest to the 2.4 kernel not being able to guarantee fsync on
IDE drives.
Sure. But note that it won't for SCSI either; since AFAICT the write
barrier support was implemented at the same time for both.
I'm not an expert on which and where -- its been a while since I was exposed
to the issue. From what I've read in a few places over time (
storagereview.com, linux and windows patches or knowledge base articles), it
happens from time to time. Drives usually get firmware updates quickly.
Drivers /
On Tue, 12 Aug 2008, Ron Mayer wrote:
Scott Carey wrote:
Some SATA drives were known to not flush their cache when told to.
Can you name one? The ATA commands seem pretty clear on the matter,
and ISTM most of the reports of these issues came from before
Linux had write-barrier support.
I c
On Tue, Aug 12, 2008 at 6:23 PM, Scott Carey <[EMAIL PROTECTED]> wrote:
> Some SATA drives were known to not flush their cache when told to.
> Some file systems don't know about this (UFS, older linux kernels, etc).
>
> So yes, if your OS / File System / Controller card combo properly sends the
> w
Scott Carey wrote:
Some SATA drives were known to not flush their cache when told to.
Can you name one? The ATA commands seem pretty clear on the matter,
and ISTM most of the reports of these issues came from before
Linux had write-barrier support.
I've yet to hear of a drive with the problem
Some SATA drives were known to not flush their cache when told to.
Some file systems don't know about this (UFS, older linux kernels, etc).
So yes, if your OS / File System / Controller card combo properly sends the
write cache flush command, and the drive is not a flawed one, all is well.
Most sh
Greg Smith wrote:
some write cache in the SATA disks...Since all non-battery backed caches
need to get turned off for reliable database use, you might want to
double-check that on the controller that's driving the SATA disks.
Is this really true?
Doesn't the ATA "FLUSH CACHE" command (say,
On Tue, Aug 12, 2008 at 1:40 PM, Henrik <[EMAIL PROTECTED]> wrote:
> Hi again all,
>
> Just wanted to give you an update.
>
> Talked to Dell tech support and they recommended using write-through(!)
> caching in RAID10 configuration. Well, it didn't work and we got even worse
> performance.
Someone at
Hi again all,
Just wanted to give you an update.
Talked to Dell tech support and they recommended using
write-through(!) caching in RAID10 configuration. Well, it didn't work and
we got even worse performance.
Anyone have an estimate of what a RAID10 on 4 15k SAS disks should
generate in rand
On Sun, 10 Aug 2008, Henrik wrote:
Normally, when a SATA implementation is running significantly faster than a
SAS one, it's because there's some write cache in the SATA disks turned on
(which they usually are unless you go out of your way to disable them).
Lucky for me I have BBU on all my co
On Aug 11, 2008, at 5:17 AM, Henrik wrote:
OK, changed the SAS RAID 10 to RAID 5 and now my random writes are
handling 112 MB/sec. So it is almost twice as fast as the RAID10
with the same disks. Any ideas why?
Are the iozone tests faulty?
does IOzone disable the os caches?
If not you ne
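iozone has switches for exactly this; a hedged example that includes fsync in the timings and bypasses the OS page cache (sizes and path hypothetical):
iozone -e -I -r 8k -s 4g -i 0 -i 2 -f /data/iozone.tmp
# -e include flush/fsync in the timings, -I use O_DIRECT to bypass the page cache,
# -r 8k record size (matches the PostgreSQL block size), -s 4g file size,
# -i 0 -i 2 run the write/rewrite and random read/write tests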
On Mon, Aug 11, 2008 at 6:08 AM, Henrik <[EMAIL PROTECTED]> wrote:
> On 11 Aug 2008, at 12.35, Glyn Astill wrote:
>
>>> It feels like there is something fishy going on. Maybe the RAID 10
>>> implementation on the PERC/6e is crap?
>>
>> It's possible. We had a bunch of perc/
On 11 Aug 2008, at 12.35, Glyn Astill wrote:
It feels like there is something fishy going on.
Maybe the RAID 10
implementation on the PERC/6e is crap?
It's possible. We had a bunch of perc/5i SAS raid cards in our
servers that performed quite well in Raid 5 but were shite in Raid
10. I
> >
> >> It feels like there is something fishy going on. Maybe the RAID 10
> >> implementation on the PERC/6e is crap?
> >
It's possible. We had a bunch of perc/5i SAS raid cards in our servers that
performed quite well in Raid 5 but were shite in Raid 10. I switched them out
for Adaptec
RAID5 parity overhead.
- Luke
- Original Message -
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: pgsql-performance@postgresql.org
Sent: Fri Aug 08 10:23:55 2008
Subject: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
Hello list,
I have a server with a direct attached storage con
On 9 Aug 2008, at 00.47, Greg Smith wrote:
On Fri, 8 Aug 2008, Henrik wrote:
It feels like there is something fishy going on. Maybe the RAID 10
implementation on the PERC/6e is crap?
Normally, when a SATA implementation is running significantly faster
than a SAS one, it's because there's som
/s, sans the RAID5 parity overhead.
- Luke
- Original Message -
From: [EMAIL PROTECTED]
<[EMAIL PROTECTED]>
To: pgsql-performance@postgresql.org
Sent: Fri Aug 08 10:23:55 2008
Subject: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
Hello list,
I have a server with a dir
On Fri, 8 Aug 2008, Henrik wrote:
It feels like there is something fishy going on. Maybe the RAID 10
implementation on the PERC/6e is crap?
Normally, when a SATA implementation is running significantly faster than
a SAS one, it's because there's some write cache in the SATA disks turned
on (
On Thu, 7 Aug 2008, Henrik wrote:
My first idea was to have one partition on the RAID 10 using ext3 with
data=writeback, noatime as mount options.
But I wonder if I should have 2 partitions on the RAID 10 one for the PGDATA
dir using ext3 and one partition for XLOGS using ext2.
Really depen
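A hedged sketch of that two-partition layout (device names and mount points hypothetical):
mount -o noatime,data=writeback /dev/sda5 /var/lib/pgsql/data      # ext3 for PGDATA
mount -o noatime /dev/sda6 /var/lib/pgsql/data/pg_xlog             # ext2 for the WAL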
On 09/08/2008, Henrik <[EMAIL PROTECTED]> wrote:
> But random writes should be faster on a RAID10 as it doesn't need to
> calculate parity. That is why people suggest RAID 10 for databases, correct?
If it had 10 spindles as opposed to 4 ... with 4 drives the "split" is (because
you're striping and m
On 8 Aug 2008, at 18.44, Mark Wong wrote:
On Fri, Aug 8, 2008 at 8:08 AM, Henrik <[EMAIL PROTECTED]> wrote:
But random writes should be faster on a RAID10 as it doesn't need to
calculate parity. That is why people suggest RAID 10 for databases,
correct?
I can understand that RAID5 can be faster w
On Fri, Aug 8, 2008 at 8:08 AM, Henrik <[EMAIL PROTECTED]> wrote:
> But random writes should be faster on a RAID10 as it doesn't need to
> calculate parity. That is why people suggest RAID 10 for databases, correct?
> I can understand that RAID5 can be faster with sequential writes.
There is some da
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: pgsql-performance@postgresql.org <[EMAIL PROTECTED]>
Sent: Fri Aug 08 10:23:55 2008
Subject: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
Hello list,
I have a server with a direct attached storage containing 4 15k SAS
drives and 6 standard SATA drives.
The server is a
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: pgsql-performance@postgresql.org
Sent: Fri Aug 08 10:23:55 2008
Subject: [PERFORM] Filesystem benchmarking for pg 8.3.3 server
Hello list,
I have a server with a direct attached storage containing 4 15k SAS
drives and 6 standard SATA drives.
The server is
Hello list,
I have a server with a direct attached storage containing 4 15k SAS
drives and 6 standard SATA drives.
The server is a quad core xeon with 16GB ram.
Both server and DAS have dual PERC/6E raid controllers with 512 MB BBU.
There are 2 RAID sets configured.
One RAID 10 containing 4 SAS d
Hi list,
I'm helping a customer with their new postgresql server and have some
questions.
The server is connected to a SAN with dual raid cards which all have
512MB cache with BBU.
The configuration they have set up now is:
2 SAS 15K drives in RAID 1 on the internal controller for OS.
6 SAS
On Sun, 6 Jul 2008, Jaime Casanova wrote:
Here http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm I read:
"""
Combining these two, an optimal fstab for the WAL might look like this:
/dev/hda2 /var ext3 defaults,writeback,noatime 1 2
"""
Is this info accurate?
Nah, that guy doe
Hi,
Here http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm I read:
"""
Combining these two, an optimal fstab for the WAL might look like this:
/dev/hda2 /var ext3 defaults,writeback,noatime 1 2
"""
Is this info accurate?
I also read on other document from the "technical document
Yes, I tried all WAL sync methods, but there was no difference...
However, there was a huge difference when I ran the same tests under
Solaris10 - the 'fdatasync' option gave the best performance level. At the
same time, direct I/O did not make a difference on Solaris 10 :)
So the main rule - there is n
On 7/9/07, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
BTW, it might be worth trying the different wal_sync_methods. IIRC,
Jonah's seen some good results from open_datasync.
On Linux, using ext3, reiser, or jfs, I've seen open_sync perform
quite a bit better than fsync/fdatasync in most of my tests. But
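A hedged sketch of cycling through the candidate settings (data directory path hypothetical; edit postgresql.conf however you prefer):
for m in fdatasync fsync open_sync open_datasync; do
    sed -i "s/^#*wal_sync_method.*/wal_sync_method = $m/" /var/lib/pgsql/data/postgresql.conf
    pg_ctl -D /var/lib/pgsql/data restart
    pgbench -c 10 -t 1000 pgbench       # compare TPS for each method
done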
On Tue, Jul 03, 2007 at 04:06:29PM +0100, Heikki Linnakangas wrote:
> Dimitri wrote:
> >I'm very curious to know if we may expect or guarantee any data
> >consistency with WAL sync=OFF but using file system mounted in Direct
> >I/O mode (means every write() system call called by PG really writes
>
Gregory, thanks for the good questions! :))
I got more light on my throughput here :))
The running OS is Solaris9 (customer is still not ready to upgrade to
Sol10), and I think the main "sync" issue is coming from the old UFS
implementation... UFS mounted with 'forcedirectio' option uses
different "
"Dimitri" <[EMAIL PROTECTED]> writes:
> Yes Gregory, that's why I'm asking, because from 1800 transactions/sec
> I'm jumping to 2800 transactions/sec! and that's a more than significant
> performance increase :))
wow. That's kind of suspicious though. Does the new configuration take
advantage o
Yes Gregory, that's why I'm asking, because from 1800 transactions/sec
I'm jumping to 2800 transactions/sec! and that's a more than significant
performance increase :))
Rgds,
-Dimitri
On 7/4/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
"Dimitri" <[EMAIL PROTECTED]> writes:
> Yes, disk drives
"Dimitri" <[EMAIL PROTECTED]> writes:
> Yes, the disk drives also have their cache disabled, or the cache is on
> the controllers and battery-protected (in the case of more high-level
> storage) - but is it enough to expect data consistency?... (I was
> surprised about checkpoint sync, but does it always cal
Yes, the disk drives also have their cache disabled, or the cache is on
the controllers and battery-protected (in the case of more high-level
storage) - but is it enough to expect data consistency?... (I was
surprised about checkpoint sync, but does it always call write()
anyway? because in this way it shoul
Dimitri wrote:
I'm very curious to know if we may expect or guarantee any data
consistency with WAL sync=OFF but using a file system mounted in Direct
I/O mode (meaning every write() system call made by PG really writes
to disk before returning)...
You'd have to turn that mode on on the data drives
All,
I'm very curious to know if we may expect or guarantee any data
consistency with WAL sync=OFF but using a file system mounted in Direct
I/O mode (meaning every write() system call made by PG really writes
to disk before returning)...
So may we expect data consistency:
- none?
- per checkpoin
On Tue, Dec 20, 2005 at 01:26:00PM +, David Roussel wrote:
> Note that you can do the taring, zipping, copying and untaring
> concurrently. I can't remember the exact netcat command line options,
> but it goes something like this
>
> Box1:
> tar czvf - myfiles/* | netcat myserver:12345
>
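For completeness, a hedged version of both ends (hostname, port, and paths hypothetical; option spelling differs between netcat variants):
# Box2 (receiver), start this first; some nc builds want just "nc -l 12345":
nc -l -p 12345 | tar xzvf -
# Box1 (sender):
tar czvf - myfiles/* | nc myserver 12345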
David Lang wrote:
ext3 has an option to make searching directories faster (htree), but
enabling it kills performance when you create files. And this doesn't
help with large files.
The ReiserFS white paper talks about the data structure he uses to
store directories (some kind of tree),
On Fri, 2 Dec 2005, Qingqing Zhou wrote:
I don't have all the numbers readily available (and I didn't do all the
tests on every filesystem), but I found that even with only 1000
files/directory ext3 had some problems, and if you enabled dir_hash some
functions would speed up, but writing lots o
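An aside, hedged: the htree feature described here is spelled dir_index in the ext3 tools, and it can be enabled after the fact (device name hypothetical):
tune2fs -O dir_index /dev/sdb1             # enable hashed directory indexes
umount /dev/sdb1 && e2fsck -fD /dev/sdb1   # -D rebuilds existing directories with the new index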
On Fri, 2 Dec 2005, David Lang wrote:
>
> I don't have all the numbers readily available (and I didn't do all the
> tests on every filesystem), but I found that even with only 1000
> files/directory ext3 had some problems, and if you enabled dir_hash some
> functions would speed up, but writing l
On Thu, 1 Dec 2005, Qingqing Zhou wrote:
"David Lang" <[EMAIL PROTECTED]> wrote
a few weeks ago I did a series of tests to compare different filesystems.
the test was for a different purpose so the particulars are not what I
would do for testing aimed at postgres, but I think the data is relevant
"David Lang" <[EMAIL PROTECTED]> wrote
>
> a few weeks ago I did a series of tests to compare different filesystems.
> the test was for a different purpose so the particulars are not what I
> would do for testing aimed at postgres, but I think the data is relevant)
> and I saw major differences
this subject has come up a couple times just today (and it looks like one
that keeps popping up).
under linux ext2/3 have two known weaknesses (or rather one weakness with
two manifestations). searching through large objects on disk is slow, this
applies to both directories (creating, opening,
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Martin Fandel wrote:
|
| I've tested dumps and copies with the xfs installation. It's
| faster than before. But the transaction queries are still slower
| than the reiserfs installation.
|
| Are any fstab-/mount-options recommended for xfs?
|
Hello
Hi,
ah you're right. :) I forgot to symlink the pg_xlog-dir to another
partition. Now it's a bit faster than before. But not faster than
the same installation with reiserfs:
[EMAIL PROTECTED]:~> pgbench -h 127.0.0.1 -p 5432 -c150 -t5 pgbench
starting vacuum...end.
transaction type: TPC-B (sort o
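The symlink step mentioned above, as a hedged sketch (paths hypothetical, server stopped first):
pg_ctl -D /var/lib/pgsql/data stop
mv /var/lib/pgsql/data/pg_xlog /wal_disk/pg_xlog
ln -s /wal_disk/pg_xlog /var/lib/pgsql/data/pg_xlog
pg_ctl -D /var/lib/pgsql/data start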
On Wed, Jun 08, 2005 at 09:36:31AM +0200, Martin Fandel wrote:
I've installed the same installation of my reiser-fs-postgres-8.0.1
with xfs.
Do you have pg_xlog on a separate partition? I've noticed that ext2
seems to have better performance than xfs for the pg_xlog workload (with
all the syncs
Hi,
I've installed the same installation of my reiser-fs-postgres-8.0.1
with xfs.
Now my pgbench shows the following results:
[EMAIL PROTECTED]:~> pgbench -h 127.0.0.1 -p 5432 -c150 -t5 pgbench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 150
numb
On Fri, 3 Jun 2005 09:06:41 +0200
"Martin Fandel" <[EMAIL PROTECTED]> wrote:
I have only a little question. Which filesystem is
preferred for postgresql? I plan to use xfs
(before, I used reiserfs). The reason
is the xfs_freeze tool to make filesystem snapshots.
XFS has worked great for us
Hi
I have tested an xfs+LVM installation with the Scalix (HP OpenMail)
mail server (a little while ago). At that time I had some problems
using xfs_freeze. I used a script for freezing the fs and storing
the snapshots. Sometimes the complete server hung (no blinking cursor,
no possible
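The freeze-and-snapshot sequence being described usually looks roughly like this (a hedged sketch, hypothetical volume names; forgetting the unfreeze is exactly the kind of thing that leaves the box hung):
xfs_freeze -f /var/lib/pgsql                                  # suspend writes so the snapshot is consistent
lvcreate --snapshot --size 5G --name pgsnap /dev/vg0/pgdata   # take the LVM snapshot
xfs_freeze -u /var/lib/pgsql                                  # resume writes immediately afterwards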
We have been using XFS for about 6 months now and it has even tolerated
a controller card crash. So far we have mostly good things to
report about XFS. I benchmarked raw throughputs at various stripe
sizes, and XFS came out on top for us against reiser and ext3. I
also used it because of its su
Martin Fandel wrote:
Hi @ all,
I have only a little question. Which filesystem is preferred for
postgresql? I plan to use xfs (before, I used reiserfs). The reason
is the xfs_freeze tool to make filesystem snapshots.
Is the performance better than reiserfs, is it reliable?
I used postgre
Hi @ all,
I have only a little question. Which filesystem is preferred for
postgresql? I plan to use xfs (before, I used reiserfs). The reason
is the xfs_freeze tool to make filesystem snapshots.
Is the performance better than reiserfs, is it reliable?
best regards,
Martin
-
CH <[EMAIL PROTECTED]> writes:
> So the clog is not written to every time the xlog is written to?
No. One clog page holds 32000 transactions' worth of transaction status
values, so on average we need only one clog page write per 32000
transactions.
> On a related issue, what's the connection bet
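(Re the 32000 figure above, a hedged aside based on the standard clog layout of 2 bits of status per transaction: 8192 bytes per page x 4 statuses per byte = 32768, i.e. roughly 32000 transactions per clog page.)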
Hi!
> > Does that mean only the xlog, or also the clog? As far as I understand, the
> > clog contains some meta-information on the xlog, so presumably it is
> > flushed to disc synchronously together with the xlog? That would mean that
> > they each need a separate disk to prevent one disk having
Shridhar Daithankar <[EMAIL PROTECTED]> writes:
> On Wednesday 19 May 2004 13:02, [EMAIL PROTECTED] wrote:
> - If you can put WAL on separate disk(s), all the better.
>>
>> Does that mean only the xlog, or also the clog?
> You can put clog and xlog on same drive.
You can, but I think you shouldn
On Wednesday 19 May 2004 13:02, [EMAIL PROTECTED] wrote:
> > - If you can put WAL on separate disk(s), all the better.
>
> Does that mean only the xlog, or also the clog? As far as I understand, the
> clog contains some meta-information on the xlog, so presumably it is
> flushed to disc synchrono
Hi!
On Mon, May 17, 2004 at 06:04:54PM +0100, Richard Huxton wrote:
> [EMAIL PROTECTED] wrote:
> > [...]
>
> In no official capacity whatsoever, welcome aboard.
Thanks ;-)
> > There is just one point where I found the documentation lacking any
> > description and practical hints (as opposed to