Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Jim C. Nasby
On Thu, Jul 28, 2005 at 05:00:44PM -0700, Mark Wong wrote:
 On Thu, 28 Jul 2005 16:55:55 -0700
 Mark Wong [EMAIL PROTECTED] wrote:
 
  On Thu, 28 Jul 2005 18:48:09 -0500
  Jim C. Nasby [EMAIL PROTECTED] wrote:
  
   On Thu, Jul 28, 2005 at 04:15:31PM -0700, Mark Wong wrote:
On Thu, 28 Jul 2005 17:17:25 -0500
Jim C. Nasby [EMAIL PROTECTED] wrote:

 On Wed, Jul 27, 2005 at 07:32:34PM -0700, Josh Berkus wrote:
   This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
   to 80 spindles (eight 10-disk arrays).  For those familiar with the
   schema, here is a visual of the disk layout:

   http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html
 
 Have you by chance tried it with the logs and data just going to
 separate RAID10s? I'm wondering if a large RAID10 would do a better job
 of spreading the load than segmenting things to specific drives.

No, haven't tried that.  That would reduce my number of spindles as I
scale up. ;)  I have the disks attached as JBODs and use LVM2 to stripe
the disks together.
   
   I'm confused... why would it reduce the number of spindles? Is
   everything just striped right now? You could always s/RAID10/RAID0/.
  
  RAID10 requires a minimum of 4 devices per LUN, I think.  At least 2
  devices in a mirror, at least 2 mirrored devices to stripe.
  
  RAID0 wouldn't be any different than what I have now, except if I use
  hardware RAID I can't stripe across controllers.  That's treating LVM2
  striping equal to software RAID0 of course.
 
 Oops, spindles was the wrong word to describe what I was losing.  But I
 wouldn't be able to spread the reads/writes across as many spindles if I
 have any mirroring.

Not sure I fully understand what you're trying to say, but it seems like
it might still be worth trying my original idea of just turning all 80
disks into one giant RAID0/striped array and seeing how much more
bandwidth you get out of that. At a minimum it would allow you to utilize
the remaining spindles, which appear to be unused right now.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Mark Wong
On Fri, 29 Jul 2005 14:39:08 -0500
Jim C. Nasby [EMAIL PROTECTED] wrote:

 On Thu, Jul 28, 2005 at 05:00:44PM -0700, Mark Wong wrote:
  On Thu, 28 Jul 2005 16:55:55 -0700
  Mark Wong [EMAIL PROTECTED] wrote:
  
   On Thu, 28 Jul 2005 18:48:09 -0500
   Jim C. Nasby [EMAIL PROTECTED] wrote:
   
On Thu, Jul 28, 2005 at 04:15:31PM -0700, Mark Wong wrote:
 On Thu, 28 Jul 2005 17:17:25 -0500
 Jim C. Nasby [EMAIL PROTECTED] wrote:
 
  On Wed, Jul 27, 2005 at 07:32:34PM -0700, Josh Berkus wrote:
This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
to 80 spindles (eight 10-disk arrays).  For those familiar with the
schema, here is a visual of the disk layout:

http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html
  
  Have you by chance tried it with the logs and data just going to
  separate RAID10s? I'm wondering if a large RAID10 would do a better job
  of spreading the load than segmenting things to specific drives.
 
 No, haven't tried that.  That would reduce my number of spindles as I
 scale up. ;)  I have the disks attached as JBODs and use LVM2 to stripe
 the disks together.

I'm confused... why would it reduce the number of spindles? Is
everything just striped right now? You could always s/RAID10/RAID0/.
   
   RAID10 requires a minimum of 4 devices per LUN, I think.  At least 2
   devices in a mirror, at least 2 mirrored devices to stripe.
   
   RAID0 wouldn't be any different than what I have now, except if I use
   hardware RAID I can't stripe across controllers.  That's treating LVM2
   striping equal to software RAID0 of course.
  
  Oops, spindles was the wrong word to describe what I was losing.  But I
  wouldn't be able to spread the reads/writes across as many spindles if I
  have any mirroring.
 
 Not sure I fully understand what you're trying to say, but it seems like
 it might still be worth trying my original idea of just turning all 80
 disks into one giant RAID0/striped array and see how much more bandwidth
 you get out of that. At a minimum it would allow you to utilize the
 remaining spindles, which appear to be unused right now.

I have actually done that before, when the tablespace patch came out.  I
was able to get almost 40% more throughput with half the drives, compared
to striping all the disks together.

Mark



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Jim C. Nasby
On Fri, Jul 29, 2005 at 12:51:57PM -0700, Mark Wong wrote:
  Not sure I fully understand what you're trying to say, but it seems like
  it might still be worth trying my original idea of just turning all 80
  disks into one giant RAID0/striped array and see how much more bandwidth
  you get out of that. At a minimum it would allow you to utilize the
  remaining spindles, which appear to be unused right now.
 
 I have done that before actually, when the tablespace patch came out.  I
 was able to get almost 40% more throughput with half the drives than
 striping all the disks together.

Wow, that's a pretty stunning difference... any idea why?

I think it might be very useful to see some raw disk IO benchmarks...


Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Mark Wong
On Fri, 29 Jul 2005 14:57:42 -0500
Jim C. Nasby [EMAIL PROTECTED] wrote:

 On Fri, Jul 29, 2005 at 12:51:57PM -0700, Mark Wong wrote:
   Not sure I fully understand what you're trying to say, but it seems like
   it might still be worth trying my original idea of just turning all 80
   disks into one giant RAID0/striped array and see how much more bandwidth
   you get out of that. At a minimum it would allow you to utilize the
   remaining spindles, which appear to be unused right now.
  
  I have done that before actually, when the tablespace patch came out.  I
  was able to get almost 40% more throughput with half the drives than
  striping all the disks together.
 
 Wow, that's a pretty stunning difference... any idea why?
 
 I think it might be very useful to see some raw disk IO benchmarks...

A lot of it has to do with how the disks are being accessed.  The log is
ideally doing sequential writes, some tables are only read, and some are
read/write.  The varying access patterns between tables, log, and indexes
can conflict with each other.

Some of it has to do with how the OS deals with file systems.  I think
on Linux there is a page buffer flush daemon per file system.  A real OS
person can answer this part better than I can.
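To make the access-pattern point concrete: the standard way to give the log its own sequential-write spindles is to relocate pg_xlog onto a dedicated volume.  This is only a sketch; the paths and data directory below are hypothetical, not the actual dev4-015 layout:

```shell
# Stop the server, move the WAL directory onto a volume backed by
# dedicated disks, and leave a symlink behind.  (initdb of this era has
# no option to place the WAL elsewhere, so the symlink trick is the
# usual approach.)
pg_ctl -D /dbdata/pgdata stop
mv /dbdata/pgdata/pg_xlog /dblog/pg_xlog
ln -s /dblog/pg_xlog /dbdata/pgdata/pg_xlog
pg_ctl -D /dbdata/pgdata start
```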

Mark



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Josh Berkus
Mark,

 I have done that before actually, when the tablespace patch came out.  I
 was able to get almost 40% more throughput with half the drives than
 striping all the disks together.

Those aren't the figures you showed me.  In your report last year it was
14%, not 40%.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Mark Wong
On Fri, 29 Jul 2005 13:35:32 -0700
Josh Berkus josh@agliodbs.com wrote:

 Mark,
 
  I have done that before actually, when the tablespace patch came out.  I
  was able to get almost 40% more throughput with half the drives than
  striping all the disks together.
 
 Those aren't the figures you showed me.  In your report last year it was
 14%, not 40%.

Sorry, I wasn't clear; I'll elaborate.  In the BOF at LWE-SF 2004, I did
report a 13% improvement, but at the same time I also said I had not
quantified it as well as I would have liked and was still working on a
better physical disk layout.  For LWE-Boston 2005, I did a little better
and reported 35% (and misquoted myself to say 40%) here in these slides:

http://developer.osdl.org/markw/presentations/lwebos2005bof.sxi

In that test I still had not separated the primary keys into separate
tablespaces.  I would imagine there is more throughput to be gained by
doing that.  I have the build scripts do that now, but again haven't
quite quantified it yet.
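As a sketch of what moving the primary keys would look like with the 8.0 tablespace support (the tablespace path and index name below are hypothetical, not what the actual build scripts use):

```shell
psql -d dbt2 <<'EOF'
-- A tablespace on a volume backed by its own set of spindles.
CREATE TABLESPACE pk_space LOCATION '/dbindex/pk';
-- Relocate an existing primary-key index into it.
ALTER INDEX orders_pkey SET TABLESPACE pk_space;
EOF
```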

Mark



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Mark Wong
On Fri, 29 Jul 2005 13:19:06 -0700
Luke Lonergan [EMAIL PROTECTED] wrote:

 Mark,
 
 On 7/29/05 12:51 PM, Mark Wong [EMAIL PROTECTED] wrote:
 
  Adaptec 2200s
 
 Have you tried non-RAID SCSI controllers in this configuration?  When we
 used the Adaptec 2120s previously, we got very poor performance using SW
 RAID (though much better than HW RAID) compared to simple SCSI controllers.
 
 See attached, particularly the RAW RESULTS tab.  Comments welcome :-)

No, we actually don't have any non-RAID SCSI controllers to try...

Mark



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-29 Thread Jim C. Nasby
On Fri, Jul 29, 2005 at 01:11:35PM -0700, Mark Wong wrote:
 On Fri, 29 Jul 2005 14:57:42 -0500
 Jim C. Nasby [EMAIL PROTECTED] wrote:
 
  On Fri, Jul 29, 2005 at 12:51:57PM -0700, Mark Wong wrote:
Not sure I fully understand what you're trying to say, but it seems like
it might still be worth trying my original idea of just turning all 80
disks into one giant RAID0/striped array and see how much more bandwidth
you get out of that. At a minimum it would allow you to utilize the
remaining spindles, which appear to be unused right now.
   
   I have done that before actually, when the tablespace patch came out.  I
   was able to get almost 40% more throughput with half the drives than
   striping all the disks together.
  
  Wow, that's a pretty stunning difference... any idea why?
  
  I think it might be very useful to see some raw disk IO benchmarks...
 
 A lot of it has to do with how the disks are being accessed.  The log is
 ideally doing sequential writes, some tables are only read, and some are
 read/write.  The varying access patterns between tables, log, and indexes
 can conflict with each other.

Well, separating the logs from everything else does make a lot of sense.
Still, it's interesting that you've been able to see so much gain.

 Some of it has to do with how the OS deals with file systems.  I think
 on Linux there is a page buffer flush daemon per file system.  A real OS
 person can answer this part better than I can.

So, about testing with FreeBSD :P


Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-28 Thread Jim C. Nasby
On Wed, Jul 27, 2005 at 09:31:39PM -0700, Mark Wong wrote:
 After seeing the discussion about how bad the disk performance is with a
 lot of scsi controllers on linux, I'm wondering if we should run some
 disk tests to see how things look.

I'd be very interested to see how FreeBSD compares to Linux on the
box... how hard would it be to do some form of multi-boot?


Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-28 Thread Jim C. Nasby
On Wed, Jul 27, 2005 at 07:32:34PM -0700, Josh Berkus wrote:
  This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
  to 80 spindles (eight 10-disk arrays).  For those familiar with the
  schema, here is a visual of the disk layout:
  http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html

Have you by chance tried it with the logs and data just going to
separate RAID10s? I'm wondering if a large RAID10 would do a better job
of spreading the load than segmenting things to specific drives.


Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-28 Thread Jim C. Nasby
On Thu, Jul 28, 2005 at 05:14:41PM -0500, Jim C. Nasby wrote:
 On Wed, Jul 27, 2005 at 09:31:39PM -0700, Mark Wong wrote:
  After seeing the discussion about how bad the disk performance is with a
  lot of scsi controllers on linux, I'm wondering if we should run some
  disk tests to see how things look.
 
 I'd be very interested to see how FreeBSD compares to Linux on the
 box... how hard would it be to do some form of multi-boot?

Err, I sent that before realizing where the tests were happening. I'm
guessing the answer is 'no'. :)


Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-28 Thread Mark Wong
On Thu, 28 Jul 2005 17:19:34 -0500
Jim C. Nasby [EMAIL PROTECTED] wrote:

 On Thu, Jul 28, 2005 at 05:14:41PM -0500, Jim C. Nasby wrote:
  On Wed, Jul 27, 2005 at 09:31:39PM -0700, Mark Wong wrote:
   After seeing the discussion about how bad the disk performance is with a
   lot of scsi controllers on linux, I'm wondering if we should run some
   disk tests to see how things look.
  
  I'd be very interested to see how FreeBSD compares to Linux on the
  box... how hard would it be to do some form of multi-boot?
 
 Err, I sent that before realizing where the tests were happening. I'm
 guessing the answer is 'no'. :)

Yeah, I might get in trouble. ;)

Mark



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-28 Thread Mark Wong
On Thu, 28 Jul 2005 17:17:25 -0500
Jim C. Nasby [EMAIL PROTECTED] wrote:

 On Wed, Jul 27, 2005 at 07:32:34PM -0700, Josh Berkus wrote:
   This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
   to 80 spindles (eight 10-disk arrays).  For those familiar with the
   schema, here is a visual of the disk layout:
 http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html
 
 Have you by chance tried it with the logs and data just going to
 separate RAID10s? I'm wondering if a large RAID10 would do a better job
 of spreading the load than segmenting things to specific drives.

No, haven't tried that.  That would reduce my number of spindles as I
scale up. ;)  I have the disks attached as JBODs and use LVM2 to stripe
the disks together.
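For readers unfamiliar with that setup, a striped LVM2 volume over JBOD disks looks roughly like this.  The device names, volume size, and stripe width here are hypothetical, not the actual dev4-015 configuration:

```shell
# Pool the raw JBOD disks into one volume group; sdb..sde stand in for
# the real device names.
pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
vgcreate dbt2vg /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Carve out a logical volume striped across all four disks
# (-i = number of stripes, -I = stripe size in KB).
lvcreate -n dbdata -i 4 -I 64 -L 200G dbt2vg
```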

Mark



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-28 Thread Jim C. Nasby
On Thu, Jul 28, 2005 at 04:15:31PM -0700, Mark Wong wrote:
 On Thu, 28 Jul 2005 17:17:25 -0500
 Jim C. Nasby [EMAIL PROTECTED] wrote:
 
  On Wed, Jul 27, 2005 at 07:32:34PM -0700, Josh Berkus wrote:
This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
to 80 spindles (eight 10-disk arrays).  For those familiar with the
schema, here is a visual of the disk layout:

http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html
  
  Have you by chance tried it with the logs and data just going to
  separate RAID10s? I'm wondering if a large RAID10 would do a better job
  of spreading the load than segmenting things to specific drives.
 
 No, haven't tried that.  That would reduce my number of spindles as I
 scale up. ;)  I have the disks attached as JBODs and use LVM2 to stripe
 the disks together.

I'm confused... why would it reduce the number of spindles? Is
everything just striped right now? You could always s/RAID10/RAID0/.


Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-28 Thread Mark Wong
On Thu, 28 Jul 2005 18:48:09 -0500
Jim C. Nasby [EMAIL PROTECTED] wrote:

 On Thu, Jul 28, 2005 at 04:15:31PM -0700, Mark Wong wrote:
  On Thu, 28 Jul 2005 17:17:25 -0500
  Jim C. Nasby [EMAIL PROTECTED] wrote:
  
   On Wed, Jul 27, 2005 at 07:32:34PM -0700, Josh Berkus wrote:
 This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
 to 80 spindles (eight 10-disk arrays).  For those familiar with the
 schema, here is a visual of the disk layout:

 http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html
   
   Have you by chance tried it with the logs and data just going to
   separate RAID10s? I'm wondering if a large RAID10 would do a better job
   of spreading the load than segmenting things to specific drives.
  
  No, haven't tried that.  That would reduce my number of spindles as I
  scale up. ;)  I have the disks attached as JBODs and use LVM2 to stripe
  the disks together.
 
 I'm confused... why would it reduce the number of spindles? Is
 everything just striped right now? You could always s/RAID10/RAID0/.

RAID10 requires a minimum of 4 devices per LUN, I think.  At least 2
devices in a mirror, at least 2 mirrored devices to stripe.

RAID0 wouldn't be any different than what I have now, except if I use
hardware RAID I can't stripe across controllers.  That's treating LVM2
striping equal to software RAID0 of course.
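The device-count constraint shows up directly in how the arrays would be created with software RAID.  The devices below are placeholders, a sketch rather than the actual setup:

```shell
# Conventional RAID10: at least four devices, and half the spindles'
# capacity goes to mirrors.
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# RAID0: every device contributes capacity and bandwidth, but a single
# disk failure loses the whole array.
mdadm --create /dev/md1 --level=0 --raid-devices=4 \
    /dev/sdf /dev/sdg /dev/sdh /dev/sdi
```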

Mark



Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-27 Thread Josh Berkus
Mark,

 I'm starting to get results with dbt2 on a 4-way opteron system and
 wanted to share what I've got so far since people have told me in the
 past that this architecture is more interesting than the itanium2 that
 I've been using.

 This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
 to 80 spindles (eight 10-disk arrays).  For those familiar with the
 schema, here is a visual of the disk layout:
   http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html

 Results for a 600 warehouse run are there:
   http://www.osdl.org/projects/dbt2dev/results/dev4-015/6/

 The tuning is still a bit off, but feel free to let me know if there
 are any issues anyway.

This e-mail came in while I was away.   I, of course, am very interested in 
running tests on this machine.   Which version of PostgreSQL is this?  What 
configuration are you doing?  I would expect that we could get at least 7000 
on this platform; let me try to tweak it.




Re: [HACKERS] [Testperf-general] dbt2 opteron performance

2005-07-27 Thread Mark Wong
On Wed, Jul 27, 2005 at 07:32:34PM -0700, Josh Berkus wrote:
 Mark,
 
  I'm starting to get results with dbt2 on a 4-way opteron system and
  wanted to share what I've got so far since people have told me in the
  past that this architecture is more interesting than the itanium2 that
  I've been using.
 
  This 4-way has 8GB of memory and four Adaptec 2200s controllers attached
  to 80 spindles (eight 10-disk arrays).  For those familiar with the
  schema, here is a visual of the disk layout:
  http://www.osdl.org/projects/dbt2dev/results/dev4-015/layout-6.html
 
  Results for a 600 warehouse run are there:
  http://www.osdl.org/projects/dbt2dev/results/dev4-015/6/
 
  The tuning is still a bit off, but feel free to let me know if there
  are any issues anyway.
 
 This e-mail came in while I was away.   I, of course, am very interested in 
 running tests on this machine.   Which version of PostgreSQL is this?  What 
 configuration are you doing?  I would expect that we could get at least 7000 
 on this platform; let me try to tweak it.

It's dev4-015.  You should be able to log in as root and create yourself
an account.  I'm not doing anything on the system now, so feel free to
poke around.  I've done a bad job of tracking what that first test was,
but recently I've been trying CVS from July 25, 2005, and also have that
base installed with v15 of the fast copy patch and Bruce's version of
the xlog patch.  I've only tried DBT2 on the system so far.

After seeing the discussion about how bad disk performance is with a
lot of SCSI controllers on Linux, I'm wondering if we should run some
disk tests to see how things look.

Mark
