Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-28 Thread Suchandra Thapa
On Mon, 2003-11-24 at 19:16, Greg Stark wrote:
> William Yu <[EMAIL PROTECTED]> writes:
> 
> > > You're right, though, mirroring a solid state drive is pretty pointless; if
> > > power fails, both mirrors are dead.
> > 
> > Actually no. Solid state memory is non-volatile. They retain data even without
> > power.
> 
> Note that flash ram only has a finite number of write cycles before it fails.
> 
> On the other hand that might not be so bad for WAL which writes sequentially,
> you can easily calculate how close you are to the maximum. For things like
> heap storage or swap it's awful as you can get hot spots that get written to
> thousands of times before the rest of the space is used.

I could be wrong, but I was under the impression that most of the newer
flash disks tended to spread writes out over the drive so that hotspots
are minimized.  

-- 
Suchandra Thapa <[EMAIL PROTECTED]>


signature.asc
Description: This is a digitally signed message part


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-26 Thread Dror Matalon

But the permissions of the base ramdisk might be wrong. I'd su to the
user that you run postgres as (probably postgres), and make sure that
you can go to the directory where the log and the database files are and
make sure you can see the files.

On Wed, Nov 26, 2003 at 10:03:47AM -0800, William Yu wrote:
> Tom Lane wrote:
> >William Yu <[EMAIL PROTECTED]> writes:
> >
> >>I then tried to put the WAL directory onto a ramdisk. I turned off 
> >>swapping, created a tmpfs mount point and copied the pg_xlog directory 
> >>over. Everything looked fine as far as I could tell but Postgres just 
> >>panic'd with a "file permissions" error. Anybody have thoughts to why 
> >>tmpfs would not work?
> >
> >
> >I'd say you got the file or directory ownership or permissions wrong.
> 
> I did a mv instead of a cp which duplicates ownership & permissions exactly.
> 
> 
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-26 Thread William Yu
Tom Lane wrote:
William Yu <[EMAIL PROTECTED]> writes:

I then tried to put the WAL directory onto a ramdisk. I turned off 
swapping, created a tmpfs mount point and copied the pg_xlog directory 
over. Everything looked fine as far as I could tell but Postgres just 
panic'd with a "file permissions" error. Anybody have thoughts to why 
tmpfs would not work?


I'd say you got the file or directory ownership or permissions wrong.
I did a mv instead of a cp which duplicates ownership & permissions exactly.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-26 Thread Tom Lane
William Yu <[EMAIL PROTECTED]> writes:
> I then tried to put the WAL directory onto a ramdisk. I turned off 
> swapping, created a tmpfs mount point and copied the pg_xlog directory 
> over. Everything looked fine as far as I could tell but Postgres just 
> panic'd with a "file permissions" error. Anybody have thoughts to why 
> tmpfs would not work?

I'd say you got the file or directory ownership or permissions wrong.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-26 Thread William Yu
Josh Berkus wrote:
William,


When my current job batch is done, I'll save a copy of the dir and give
the WAL on ramdrive a test. And perhaps even buy a Sandisk at the local
store and run that through the hooper.


We'll be interested in the results.   The Sandisk won't be much of a 
performance test; last I checked, their access speed was about 1/2 that of a 
fast SCSI drive.   But it could be a feasability test for the more expensive 
RAMdrive approach.
Some initial numbers. I simulated a CPU increase by underclocking the 
processors. Most of the time, performance does not scale linearly with 
clock speed but since I also underclocked the FSB and memory bandwidth 
with the CPU, it's nearly an exact match.

1.15GHz6.14
1.53GHz6.97 +33% CPU = +13.5% performance
I then simulated adding a heapload of extra memory by running my job a 
second time. Unfortunately, to keep my 25GB DB mostly cached in memory, 
the word heapload is too accurate.

Run 1  6.97
Run 2  7.99 +14%
I popped in an extra IDE hard drive to store the WAL files and that 
boosted the numbers by a little. From looking at iostat, the ratio 
looked like 300K/s WAL for 1MB/s data.

WAL+Data on same disk6.97
WAL+Data separated   7.26 +4%
I then tried to put the WAL directory onto a ramdisk. I turned off 
swapping, created a tmpfs mount point and copied the pg_xlog directory 
over. Everything looked fine as far as I could tell but Postgres just 
panic'd with a "file permissions" error. Anybody have thoughts to why 
tmpfs would not work?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-25 Thread Tom Lane
Shridhar Daithankar <[EMAIL PROTECTED]> writes:
> William Yu wrote:
>> This is an intriguing thought which leads me to think about a similar 
>> solution for even a production server and that's a solid state drive for 
>> just the WAL. What's the max disk space the WAL would ever take up? 

> Maximum number of WAL segments at any time in 2*(number of checkpoint 
> segments)+1 IIRC.
> So if you have 3 checkpoint segments, you can not have more than 7 WAL
> segments at any time. Give or take 1.

I don't believe that's a *hard* limit.  The system tries to schedule
checkpoints often enough to prevent WAL from getting bigger than that,
but if you had a sufficiently big spike in update activity, it's at
least theoretically possible that more than checkpoint_segments segments
could be filled before the concurrently running checkpoint finishes and
releases some old segments.

The odds of this being a real problem are small, especially if you don't
try to fit on an undersized SSD by reducing checkpoint_segments.  I'd
think that a 512Mb SSD would be plenty of space for ordinary update load
levels ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread Shridhar Daithankar
William Yu wrote:

This is an intriguing thought which leads me to think about a similar 
solution for even a production server and that's a solid state drive for 
just the WAL. What's the max disk space the WAL would ever take up? 
There's quite a few 512MB/1GB/2GB solid state drives available now in 
the ~$200-$500 range and if you never hit those limits...
Maximum number of WAL segments at any time in 2*(number of checkpoint 
segments)+1 IIRC.

So if you have 3 checkpoint segments, you can not have more than 7 WAL segments 
at any time. Give or take 1.

Correct me if I am wrong..

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread Greg Stark
William Yu <[EMAIL PROTECTED]> writes:

> > You're right, though, mirroring a solid state drive is pretty pointless; if
> > power fails, both mirrors are dead.
> 
> Actually no. Solid state memory is non-volatile. They retain data even without
> power.

Note that flash ram only has a finite number of write cycles before it fails.

On the other hand that might not be so bad for WAL which writes sequentially,
you can easily calculate how close you are to the maximum. For things like
heap storage or swap it's awful as you can get hot spots that get written to
thousands of times before the rest of the space is used.

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread William Yu
Josh Berkus wrote:
William,


The SanDisks do seem a bit pokey at 16MBps. On the otherhand, you could
get 4 of these suckers, put them in a mega-RAID-0 stripe for 64MBps. You
shouldn't need to do mirroring with a solid state drive.


I wouldn't count on RAID0 improving the speed of SANDisk's much.  How are you 
connecting to them?  USB?   USB doesn't support fast parallel data access.
You can get ATA SanDisks up to 2GB. Another vendor I checked out -- 
BitMicro -- has solid state drives for SATA, SCSI and FiberChannel. I'd 
definitely would not use USB SSDs -- USB performance would be so pokey 
to be useless.

Now, if it turns out that 256MB ramdisks are less than 1/5 the cost of 1GB 
ramdisks, then that's worth considering.
Looks like they're linear with size. SanDisk Flashdrive 1GB is about 
$1000 while 256MB is $250.

You're right, though, mirroring a solid state drive is pretty pointless; if 
power fails, both mirrors are dead.  
Actually no. Solid state memory is non-volatile. They retain data even 
without power.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread Josh Berkus
William,

> The SanDisks do seem a bit pokey at 16MBps. On the otherhand, you could
> get 4 of these suckers, put them in a mega-RAID-0 stripe for 64MBps. You
> shouldn't need to do mirroring with a solid state drive.

I wouldn't count on RAID0 improving the speed of SANDisk's much.  How are you 
connecting to them?  USB?   USB doesn't support fast parallel data access.

Now, if it turns out that 256MB ramdisks are less than 1/5 the cost of 1GB 
ramdisks, then that's worth considering.

You're right, though, mirroring a solid state drive is pretty pointless; if 
power fails, both mirrors are dead.  

As I said before, though, we're all very interested in this test.  Using a 
ramdisk for WAL has been discussed on this list numerous times but not 
attempted by anyone who published their results.

All that aside, though, I think you should also experiment with the Background 
Writer patch recently discussed on Hackers, as it may give you a performance 
boost as well.
-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread William Yu
Josh Berkus wrote:
William,


When my current job batch is done, I'll save a copy of the dir and give
the WAL on ramdrive a test. And perhaps even buy a Sandisk at the local
store and run that through the hooper.


We'll be interested in the results.   The Sandisk won't be much of a 
performance test; last I checked, their access speed was about 1/2 that of a 
fast SCSI drive.   But it could be a feasability test for the more expensive 
RAMdrive approach.



The SanDisks do seem a bit pokey at 16MBps. On the otherhand, you could 
get 4 of these suckers, put them in a mega-RAID-0 stripe for 64MBps. You 
shouldn't need to do mirroring with a solid state drive.

Time to Google up some more solid state drive vendors.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread Josh Berkus
William,

> When my current job batch is done, I'll save a copy of the dir and give
> the WAL on ramdrive a test. And perhaps even buy a Sandisk at the local
> store and run that through the hooper.

We'll be interested in the results.   The Sandisk won't be much of a 
performance test; last I checked, their access speed was about 1/2 that of a 
fast SCSI drive.   But it could be a feasability test for the more expensive 
RAMdrive approach.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread William Yu
This is an intriguing thought which leads me to think about a similar 
solution for even a production server and that's a solid state drive for 
just the WAL. What's the max disk space the WAL would ever take up? 
There's quite a few 512MB/1GB/2GB solid state drives available now in 
the ~$200-$500 range and if you never hit those limits...

When my current job batch is done, I'll save a copy of the dir and give 
the WAL on ramdrive a test. And perhaps even buy a Sandisk at the local 
store and run that through the hooper.

Shridhar Daithankar wrote:
Mount WAL on RAM disk. WAL is most often hit area for heavy 
updates/inserts. If you spped that up, things should be pretty faster.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread Bill Moran
William Yu wrote:
My situation is this. We have a semi-production server where we 
pre-process data and then upload the finished data to our production 
servers. We need the fastest possible write performance. Having the DB 
go corrupt due to power loss/OS crash is acceptable because we can 
always restore from last night and re-run everything that was done since 
then.

I already have fsync off. Short of buying more hardware -- which I will 
probably do anyways once I figure out whether I need more CPU, memory or 
disk -- what else can I do to max out the speed? Operation mix is about 
50% select, 40% insert, 10% update.
In line with what Tom Lane said, you may want to look at the various
memory databases available (I'm not familiar with any one to recommend,
though)  If you can fit the whole database in RAM, that would work
great, if not, you may be able to split the DB up and put the most
used tables just in the memory database.
I have also seen a number tutorials on how to put a database on a
RAM disk.  This helps, but it's still not as fast as a database server
that's designed to keep all its data in RAM.
--
Bill Moran
Potential Technologies
http://www.potentialtech.com
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-23 Thread Shridhar Daithankar
William Yu wrote:

My situation is this. We have a semi-production server where we 
pre-process data and then upload the finished data to our production 
servers. We need the fastest possible write performance. Having the DB 
go corrupt due to power loss/OS crash is acceptable because we can 
always restore from last night and re-run everything that was done since 
then.

I already have fsync off. Short of buying more hardware -- which I will 
probably do anyways once I figure out whether I need more CPU, memory or 
disk -- what else can I do to max out the speed? Operation mix is about 
50% select, 40% insert, 10% update.
Mount WAL on RAM disk. WAL is most often hit area for heavy updates/inserts. If 
you spped that up, things should be pretty faster.

A non-tried advice though. Given that you can afford a crash, I would say it is 
worth a try..

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-23 Thread Joe Conway
William Yu wrote:
My situation is this. We have a semi-production server where we 
pre-process data and then upload the finished data to our production 
servers. We need the fastest possible write performance. Having the DB 
go corrupt due to power loss/OS crash is acceptable because we can 
always restore from last night and re-run everything that was done since 
then.
If you can, use COPY -- it is far faster than INSERT.

See:
http://www.postgresql.org/docs/current/static/sql-copy.html
HTH,

Joe



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-23 Thread Josh Berkus
William,

> I already have fsync off. Short of buying more hardware -- which I will
> probably do anyways once I figure out whether I need more CPU, memory or
> disk -- what else can I do to max out the speed? Operation mix is about
> 50% select, 40% insert, 10% update.

Disk.   Multi-channel RAID is where it's at, and/or RAID with a great write 
cache enabled.  For really fast updates, I'd suggest 6-disk or even 8-disk 
RAID 1+0.

As soon as you have gobs of extra disk space, jack your checkpoint_buffers way 
up, like a couple of gigs.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-23 Thread Tom Lane
William Yu <[EMAIL PROTECTED]> writes:
> [ we don't care about data integrity ]
> I already have fsync off. Short of buying more hardware -- which I will 
> probably do anyways once I figure out whether I need more CPU, memory or 
> disk -- what else can I do to max out the speed? Operation mix is about 
> 50% select, 40% insert, 10% update.

Batch operations so you commit more than one insert per transaction.
(With fsync off, this isn't such a killer consideration as it would be
with fsync on, but the per-transaction overhead is still nontrivial.)

Get rid of as many integrity constraints as you feel can reasonably be
postponed to the final upload.  FK checks are particularly painful.

Eliminate indexes where possible.

Also (I hate to say this, but...) you should consider using Some Other
Database.  "I don't care about data integrity, only speed" sounds like
a good fit to MySQL ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster